NPPD: A Protein-Protein Docking Scoring Function Based on Dyadic Differences in Networks of Hydrophobic and Hydrophilic Amino Acid Residues

Edward S. C. Shih and Ming-Jing Hwang *

Institute of Biomedical Sciences, Academia Sinica, Nankang, Taipei 115, Taiwan; E-Mail: shihds@gate.sinica.edu.tw

* Author to whom correspondence should be addressed; E-Mail: mjhwang@ibms.sinica.edu.tw; Tel.: +886-2-27899033; Fax: +886-2-27887641.

Academic Editor: Thorsten Berg

Biology (Basel) 2015, 4(2), 282-297; doi:10.3390/biology4020282; ISSN 2079-7737; PMID: 25811640; PMCID: PMC4498300

Received: 27 November 2014 / Accepted: 16 March 2015 / Published: 24 March 2015

© 2015 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).

Protein-protein docking (PPD) predictions usually rely on the use of a scoring function to rank docking models generated by exhaustive sampling. To rank good models higher than bad ones, a large number of scoring functions have been developed and evaluated, but the methods used for the computation of PPD predictions remain largely unsatisfactory. Here, we report a network-based PPD scoring function, the NPPD, in which the network consists of two types of network nodes, one for hydrophobic and the other for hydrophilic amino acid residues, and the nodes are connected when the residues they represent are within a certain contact distance. We showed that network parameters that compute dyadic interactions and those that compute heterophilic interactions of the amino acid networks thus constructed allowed NPPD to perform well in a benchmark evaluation of 115 PPD scoring functions, most of which, unlike NPPD, are based on some sort of protein-protein interaction energy. We also showed that NPPD was highly complementary to these energy-based scoring functions, suggesting that the combined use of conventional scoring functions and NPPD might significantly improve the accuracy of current PPD predictions.

Keywords: protein-protein docking; scoring function; dyadicity; amino acid network
1. Introduction

Living cells are a crowded environment in which most proteins interact with other proteins to exert cellular functions. To understand how protein-protein interactions mediate cellular processes, scientists often need to describe the structures of protein complexes at the atomic level. However, due to the difficulty in determining the atomic structures of protein complexes using experimental methods, protein-protein docking (PPD), a computational approach, is often used to complement results from experimental studies [1].

Most methods for PPD predictions involve a two-step strategy: sampling and scoring. In the sampling step, numerous docking models, also referred to as docking poses or decoys, are generated from a global search of all possible relative orientations of, and separations between, the two proteins that are brought together to form a complex. In the scoring step, these docking poses are ranked by a scoring function. To evaluate the performance of a given scoring function on a set of protein complexes, the TopN success rate is usually employed, in which a “success” hit for a complex is recorded when at least one of its top N docking poses, as ranked by the scoring function, satisfies a specified criterion for being a good (i.e., near-native) model. It follows that, for a given scoring function, a higher success rate (i.e., a higher number of correctly predicted complexes) can be obtained by computing the success rate at a larger N, since, for a given complex, more poses are considered and the probability that at least one of them is good is therefore higher. The objective when developing a PPD scoring function is, therefore, to rank good poses high and bad poses low. However, despite significant progress in recent years, this is still an active area of research [2,3], as success rates remain low when small values of N are used (e.g., using a stringent criterion, Top1 and Top10 success rates are generally below 10% and 20%, respectively), unless dockings are guided by experimentally-derived data or information [4,5].
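The TopN success rate described above can be illustrated with a minimal Python sketch. The function name, the input layout (per-complex lists of good/bad flags already sorted by score), and the toy complex IDs are assumptions made for illustration; they are not taken from any benchmark.

```python
from typing import Dict, List


def top_n_success_rate(ranked_labels: Dict[str, List[bool]], n: int) -> float:
    """Fraction of complexes with at least one good pose among the top n.

    ranked_labels maps a complex ID to the good/bad flags of its poses,
    already sorted from best to worst by the scoring function.
    """
    if not ranked_labels:
        raise ValueError("no complexes supplied")
    hits = sum(1 for flags in ranked_labels.values() if any(flags[:n]))
    return hits / len(ranked_labels)


# Toy data (hypothetical complex IDs and labels, for illustration only).
toy = {
    "cplxA": [False, True, False, False],
    "cplxB": [False, False, False, True],
    "cplxC": [False, False, False, False],
}
top1 = top_n_success_rate(toy, 1)
top2 = top_n_success_rate(toy, 2)
top4 = top_n_success_rate(toy, 4)
```

On this toy input the rate can only stay level or rise as N grows, which is the monotonic behavior noted in the text.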

Most PPD scoring functions use a set of mathematical equations to compute the energy resulting from the formation of the protein complex. To do so, many use molecular mechanics functions [6,7,8,9,10,11,12,13,14,15,16], while others use statistical mechanics methods to derive potentials from various sources, including experimentally-determined protein structures [8,10,17,18,19], docking decoys [6,20,21,22,23], homology models [24,25,26], or binding energy funnels [27,28]. Many non-energy-based PPD scoring functions have also been developed, including those that utilize bioinformatics-predicted information [29,30], shape complementarity [31,32], machine learning [33,34,35], coevolution [36], and amino acid networks (AANs) [37,38].

As described in the Experimental Section below, NPPD, the network-based PPD scoring function developed in this work, is based on AANs, which have also been referred to as residue contact networks [39], protein contact networks [40], protein structure networks [41], or residue interaction networks [42], although these networks may not be completely identical in terms of their construction (for reviews, see [39,40,41,43,44]). Owing to the appeal of network analysis in the era of post-genomics research, there has been an increase in the number of studies utilizing AANs to predict a protein’s functional sites [45,46,47], protein-protein [48,49,50,51] and protein-nucleic acid interaction [52,53], and to probe protein dynamics [42,54,55], folding [56,57,58] and structure [59,60,61,62,63]. Of these studies using AANs, two reports by Pons et al. [37] and Chang et al. [38] on PPD are directly relevant to the present work.

In AANs, the protein structure is modeled by a three-dimensional geometric network, with the amino acid residues (usually the Cα or Cβ atoms) being represented as network nodes and their contacts as network edges to capture the interactions between amino acids within the same protein structure and/or between two interacting proteins. Pons et al. [37] showed that network parameters, such as closeness and betweenness, can be used to suggest protein-protein interaction regions, and that an energy term that models this information can be added to an energy-based scoring function to improve PPD predictions. Chang et al. [38] used two networks for a single protein structure, one formed by hydrophobic residues and the other by hydrophilic residues, and analyzed the two networks from the same complex (docking pose) separately; their results again demonstrated that network properties can be used to assist conventional scoring functions to distinguish between good and bad PPD decoys.

Unlike Chang et al., in developing NPPD, we constructed only a single network for a single protein structure, allowing both the hydrophobic (H) and hydrophilic (i.e., polar, P) residue nodes to coexist in the same network. We were then able to investigate not only the effects of dyadicity calculated from the hydrophobic-hydrophobic (HH) and polar-polar (PP) interactions, but also the effects of heterophilicity calculated from the hydrophobic-polar (HP) interactions on the scoring of PPD poses. Benchmark evaluations showed that, using network parameters alone in all three methods, NPPD performed better than the network-assisted PPD predictions reported by Pons et al. [37] and Chang et al. [38], and that NPPD also performed well compared to most energy-based scoring functions. In addition, further analysis revealed significant complementarity between NPPD and the other scoring functions evaluated, demonstrating the merit of using a combination of NPPD and other types of scoring functions to further improve PPD predictions.

2. Experimental Section

Figure 1 outlines the procedures used to develop NPPD. Briefly, the interface residues of a given complex (i.e., docking pose) of protein A and protein B were determined, yielding the H and P nodes for the construction of the AANs for A and B. Eight parameters for each of the two networks were computed and served as attributes for training and testing a Bayesian network model using a PPD benchmark dataset. Note that, during the training of the Bayesian model, the complex context of all the poses was removed and each AAN was treated independently, although, during the machine learning, those that came from a good pose were used as positive incidences and those from a bad pose as negative incidences. Using the Bayesian model thus derived, NPPD can then score any given pose by multiplying together the Bayesian probabilities of the two AANs. This has the advantage of quickly eliminating most of the bad poses since it takes just one bad AAN (i.e., a low Bayesian probability) to produce a bad product (pose) of two AANs. Note that, as illustrated in Figure 1, our AAN was constructed on one side of the interface and did not extend to include contacts from the other side, because including inter-protein contacts did not improve the results [64], possibly owing to the fact that the connections of an inter-protein network can change significantly even by minor changes in the configuration of the docking pose. Still, it may be warranted for future studies to find a way to use inter-protein contacts productively in the Bayesian model.
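The product rule for scoring a pose can be sketched as follows. This is only an illustration of the multiplication step: the probabilities are made-up numbers standing in for the output of the trained Bayesian model, and the function and pose names are hypothetical.

```python
from typing import Dict, Tuple


def pose_score(p_a: float, p_b: float) -> float:
    """Score a pose as the product of the probabilities of its two AANs."""
    for p in (p_a, p_b):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_a * p_b


# Hypothetical per-AAN probabilities (protein A side, protein B side).
poses: Dict[str, Tuple[float, float]] = {
    "pose1": (0.90, 0.80),
    "pose2": (0.95, 0.05),  # one poor AAN is enough to sink the pose
    "pose3": (0.60, 0.60),
}
ranking = sorted(poses, key=lambda name: pose_score(*poses[name]), reverse=True)
```

Here pose2 ranks last even though one of its AANs has the highest probability of all, which is the elimination effect described above.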

Figure 1. Procedures used to develop NPPD. (a) An example of an amino acid network and the network parameters used in this study for a docking pose; (b) Flowchart of the training and testing of a Bayesian network model of NPPD.

2.1. Docking Datasets, Poses, and Quality Measures

The 176 protein complexes used in this study were retrieved from a PPD benchmark dataset of known atomic structures of complex component proteins in both the bound (complex) and unbound (free) form [65]. For each of the 176 complexes, two sets of docking poses from the unbound form were used to evaluate the performance of NPPD and compare it with those of several other PPD scoring functions. One set contained the top 54,000 poses for each of 176 complexes generated by ZDOCK [66] and was downloaded from its website (http://zlab.umassmed.edu/zdock/decoys.shtml). The other set, kindly provided by the authors of a large-scale evaluation of 115 scoring functions [67], consisted of ~500 poses generated using SwarmDock [68] for each of a subset containing 118 complexes. The two sets came with their own quality measures for near-native poses, i.e., the so-called good poses; that used for the ZDOCK-generated set was an interface RMSD (IRMSD) < 2.5 Å, where IRMSD is the root mean square displacement of the interface residue’s Cα atoms from the experimentally determined structure of the bound complex and an interface residue is defined as one having at least one heavy (non-hydrogen) atom within 5 Å of any heavy atom in the second protein of the complex, while those used for the SwarmDock-generated set were three quality measures from the CAPRI criteria [2] for acceptable, medium, and high quality.
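The interface-residue definition used above (at least one heavy atom within 5 Å of any heavy atom of the partner protein) can be sketched in plain Python. The data layout (residue ID mapped to a list of heavy-atom coordinates) and the toy coordinates are assumptions for illustration. The sketch does not compute IRMSD, which additionally requires superposition onto the bound structure.

```python
import math
from typing import Dict, List, Set, Tuple

Atom = Tuple[float, float, float]
Protein = Dict[str, List[Atom]]  # residue ID -> heavy-atom coordinates (in Å)


def interface_residues(
    prot_a: Protein, prot_b: Protein, cutoff: float = 5.0
) -> Tuple[Set[str], Set[str]]:
    """Return the interface residues of each protein in a docking pose."""
    iface_a: Set[str] = set()
    iface_b: Set[str] = set()
    for res_a, atoms_a in prot_a.items():
        for res_b, atoms_b in prot_b.items():
            # A pair is in contact if any heavy-atom pair is within the cutoff.
            if any(math.dist(x, y) <= cutoff for x in atoms_a for y in atoms_b):
                iface_a.add(res_a)
                iface_b.add(res_b)
    return iface_a, iface_b


# Toy coordinates (hypothetical, for illustration only).
toy_a: Protein = {"A1": [(0.0, 0.0, 0.0)], "A2": [(20.0, 0.0, 0.0)]}
toy_b: Protein = {"B1": [(3.0, 0.0, 0.0)], "B2": [(0.0, 30.0, 0.0)]}
iface_a, iface_b = interface_residues(toy_a, toy_b)
```

The all-pairs loop is quadratic in the number of atoms; for 54,000 poses per complex a spatial index would likely be preferable, but that choice is outside what the text specifies.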

2.2. Amino Acid Networks and Network Parameters

As described above, two AANs were constructed from the interface residues of two interacting proteins locked in a docking pose. In this work, the 20 amino acids were divided into two classes according to Eisenberg et al. [69], the H class consisting of Gly, Ile, Leu, Val, Phe, Met, Trp, Cys, Tyr, and Ala, and the P class consisting of Lys, Thr, Ser, Gln, Asn, Glu, Asp, Arg, His, and Pro. Our AANs, thus, contained two types of nodes, H and P, and a network edge was established to connect any two nodes (residues) if any heavy atom in one of the residues was within 5.0 Å of any heavy atom in the other (Figure 1a).
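A minimal sketch of the network construction just described is given below. The H/P classes and the 5.0 Å heavy-atom cutoff follow the text; the input layout, the function names, and the toy residues are assumptions for illustration.

```python
import itertools
import math
from typing import Dict, List, Set, Tuple

Atom = Tuple[float, float, float]

# Two residue classes as listed in the text (three-letter codes, upper case).
H_CLASS = {"GLY", "ILE", "LEU", "VAL", "PHE", "MET", "TRP", "CYS", "TYR", "ALA"}
P_CLASS = {"LYS", "THR", "SER", "GLN", "ASN", "GLU", "ASP", "ARG", "HIS", "PRO"}


def node_type(resname: str) -> str:
    """Map a residue name to its node type, 'H' or 'P'."""
    name = resname.upper()
    if name in H_CLASS:
        return "H"
    if name in P_CLASS:
        return "P"
    raise ValueError(f"not one of the 20 standard amino acids: {resname}")


def build_aan(
    residues: Dict[str, Tuple[str, List[Atom]]], cutoff: float = 5.0
) -> Tuple[Dict[str, str], Set[Tuple[str, str]]]:
    """Build one AAN from the interface residues of a single protein.

    residues maps a residue ID to (residue name, heavy-atom coordinates).
    Returns the node types and the set of undirected edges.
    """
    nodes = {rid: node_type(name) for rid, (name, _) in residues.items()}
    edges: Set[Tuple[str, str]] = set()
    for (id1, (_, atoms1)), (id2, (_, atoms2)) in itertools.combinations(
        residues.items(), 2
    ):
        if any(math.dist(x, y) <= cutoff for x in atoms1 for y in atoms2):
            edges.add((id1, id2))
    return nodes, edges


# Toy interface (hypothetical coordinates, one heavy atom per residue).
toy = {
    "R1": ("LEU", [(0.0, 0.0, 0.0)]),
    "R2": ("LYS", [(4.0, 0.0, 0.0)]),
    "R3": ("ASP", [(12.0, 0.0, 0.0)]),
}
nodes, edges = build_aan(toy)
```

Only residues of one protein are passed in, so no inter-protein edges are created, in line with the construction shown in Figure 1a.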

For each AAN, we computed two dyadicity parameters, Dp-p and Dh-h, and one heterophilicity parameter, Hp-h, which, following the work of Park and Barabasi [70], are defined as:

$$D_{pp} = \frac{m_{pp}}{E(m_{pp})}, \qquad D_{hh} = \frac{m_{hh}}{E(m_{hh})}, \qquad H_{ph} = \frac{m_{ph}}{E(m_{ph})}$$

where $m_{pp}$, $m_{hh}$, and $m_{ph}$ are, respectively, the number of P-P, H-H, and P-H edges in the AAN, and the three denominators are the respective expected values of $m_{pp}$, $m_{hh}$, and $m_{ph}$, which can be computed as:

$$E(m_{pp}) = \frac{n_p (n_p - 1)}{2}\, p, \qquad E(m_{hh}) = \frac{n_h (n_h - 1)}{2}\, p, \qquad E(m_{ph}) = n_p\, n_h\, p$$

where $n_p$ is the number of P nodes, $n_h$ the number of H nodes, and $p = 2M / [N(N-1)]$ ($M$ and $N$ are the total number of edges and nodes, respectively) is the connectance, which represents the average probability that two nodes in a dyadic network are connected [71].
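The three parameters can be computed directly from a node-type map and an edge list, as in the following sketch (observed edge count divided by its expectation under connectance p). The function name, the input layout, and the choice to return NaN when an expected count is zero are assumptions; the text does not say how such cases were handled.

```python
import math
from typing import Dict, Iterable, Tuple


def dyadic_parameters(
    nodes: Dict[str, str], edges: Iterable[Tuple[str, str]]
) -> Dict[str, float]:
    """Return D_pp, D_hh and H_ph for one network.

    nodes maps a residue ID to "H" or "P"; edges holds undirected
    residue-ID pairs, each listed once.
    """
    edge_list = list(edges)
    n_total, m_total = len(nodes), len(edge_list)
    if n_total < 2 or m_total == 0:
        raise ValueError("need at least two nodes and one edge")

    n_p = sum(1 for t in nodes.values() if t == "P")
    n_h = n_total - n_p
    # Connectance: average probability that two nodes are connected.
    p = 2 * m_total / (n_total * (n_total - 1))

    counts = {"PP": 0, "HH": 0, "PH": 0}
    for u, v in edge_list:
        # Sorting in reverse order maps both "HP" and "PH" to "PH".
        pair = "".join(sorted(nodes[u] + nodes[v], reverse=True))
        counts[pair] += 1

    expected = {
        "PP": n_p * (n_p - 1) / 2 * p,
        "HH": n_h * (n_h - 1) / 2 * p,
        "PH": n_p * n_h * p,
    }

    def ratio(key: str) -> float:
        # Assumption: report NaN when the expected count is zero.
        return counts[key] / expected[key] if expected[key] > 0 else math.nan

    return {"D_pp": ratio("PP"), "D_hh": ratio("HH"), "H_ph": ratio("PH")}


# Toy network: N = 4 nodes, M = 4 edges, so p = 2/3.
toy_nodes = {"a": "H", "b": "H", "c": "P", "d": "P"}
toy_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
params = dyadic_parameters(toy_nodes, toy_edges)
```

In the toy network the expected counts are 2/3, 2/3, and 8/3 (summing to M = 4), giving D_pp = D_hh = 1.5 and H_ph = 0.75. A value above 1 means that edge type occurs more often than expected by chance.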

2.3. Bayesian Network

To infer whether two AANs would generate a near-correct docking pose, we employed the machine learning algorithm implemented in the Weka platform [72] to derive a Bayesian network model [73], which we then used to compute the probability for every AAN of being at the interface of a protein complex. We then computed the probability product of two AANs to give an estimate of the likelihood of the resulting docking pose being a good one (Figure 1b). The aforementioned 176 benchmark complexes and their 54,000 poses per complex generated by ZDOCK were used in a leave-one-out training and testing of the Bayesian model, i.e., each of the 176 complexes was, in turn, left out during training of the model on AANs randomly selected from poses of the remaining 175 complexes and was then used as a test case. As shown in Figure 1b, we randomly selected 27,000 AANs from good poses, irrespective of whether they came from the same complex or not, as positive incidences and an equal number of AANs from bad poses as negative incidences, and used the values of the 8 parameters of Dp-p, Dh-h, Hp-h, mpp, mhh, mph, np and nh of the AANs as attributes for training. The training set-derived Bayesian model was then used to score poses of the left-out complex as a test of the model.
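The leave-one-out sampling scheme can be sketched as below. This covers only the selection of balanced training sets; the Bayesian network itself was trained in Weka and is not reproduced here. The function name, the data layout, and the rule of capping the sample size at the smaller class are assumptions for illustration.

```python
import random
from typing import Dict, Iterator, List, Tuple

LabeledAAN = Tuple[str, bool]  # (AAN identifier, came from a good pose?)


def leave_one_complex_out(
    aans: Dict[str, List[LabeledAAN]], n_per_class: int, seed: int = 0
) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Yield (held-out complex, positive sample, negative sample).

    For each complex in turn, AANs of all other complexes are pooled and an
    equal number of positives and negatives is drawn at random.
    """
    rng = random.Random(seed)
    for held_out in aans:
        pool = [
            item
            for cid, items in aans.items()
            if cid != held_out
            for item in items
        ]
        positives = [aid for aid, good in pool if good]
        negatives = [aid for aid, good in pool if not good]
        # Assumption: shrink the sample if either class is too small.
        k = min(n_per_class, len(positives), len(negatives))
        yield held_out, rng.sample(positives, k), rng.sample(negatives, k)


# Toy data (hypothetical complex and AAN identifiers).
toy = {
    "c1": [("c1_a", True), ("c1_b", False)],
    "c2": [("c2_a", True), ("c2_b", False), ("c2_c", False)],
    "c3": [("c3_a", True), ("c3_b", True), ("c3_c", False)],
}
splits = list(leave_one_complex_out(toy, n_per_class=2))
```

In the study, n_per_class would correspond to the 27,000 AANs drawn per class, and the held-out complex supplies the test poses.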

3. Results and Discussion

3.1. Performance of NPPD and IRAD

The TopN success rates obtained using poses created and ranked by ZDOCK [66] and IRAD [74], a state-of-the-art PPD scoring function, have often been used as yardsticks to evaluate PPD scoring functions [3,4,5]. Both ZDOCK and IRAD use a multitude of scoring terms, such as shape complementarity, interface atomic contact energy, and electrostatics, and IRAD also uses both atom-based and residue-based potentials [66,74]. As can be seen in Figure 2, using the 54,000 poses created by ZDOCK for each of the 176 benchmark complexes, the Bayesian probabilities of NPPD produced worse Top1 and Top10 success rates than either ZDOCK or IRAD, but, as N increased, the success rates increased faster for NPPD than for ZDOCK or IRAD, with NPPD outperforming the other two when N > 100.

Figure 2. TopN success rates for NPPD, ZDOCK, and IRAD on the benchmark dataset of the unbound docking poses of 176 protein complexes. IRMSD < 2.5 Å was used to determine good (near-correct) poses. The success rates of ZDOCK and IRAD were obtained from the ZDOCK website (http://zlab.umassmed.edu/zdock/perf_decoys.shtml).

Despite the low success rates of NPPD at a low N, it is interesting that, as shown in Table 1, many of the complexes that NPPD succeeded at predicting were different from those predicted by IRAD and vice versa. The complementarity between the two methods, measured as the ratio of the method-unique successes divided by all successes and expressed as a percentage, was especially significant at low N, being as high as 86% for the Top1 success rate (only 3 out of 22 complexes were successfully predicted by both methods).

Table 1. Number of benchmark complexes successfully predicted by NPPD and/or IRAD at different TopN success rates.

| Set | Top1 | Top10 | Top100 | Top1000 | Top2000 |
|---|---|---|---|---|---|
| NPPD (A) | 9 | 28 | 65 | 102 | 110 |
| IRAD (B) | 16 | 43 | 64 | 92 | 102 |
| Intersection (A∩B) | 3 | 15 | 44 | 80 | 95 |
| Union (A∪B) = a | 22 | 56 | 85 | 114 | 117 |
| Unique to NPPD or IRAD (A⊖B) = b | 19 | 41 | 41 | 34 | 22 |
| Complementarity = b/a | 86% | 73% | 48% | 30% | 19% |

⊖ (Symmetric difference): the set of elements in either of the sets and not in their intersection.
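As a worked check of the complementarity measure, the sketch below computes b/a with Python sets. The complex identifiers are synthetic placeholders constructed only to reproduce the Top1 counts of Table 1 (9 for NPPD, 16 for IRAD, 3 in common); they do not correspond to actual benchmark complexes.

```python
from typing import Set


def complementarity(a: Set[str], b: Set[str]) -> float:
    """Symmetric difference divided by union, as defined in Table 1."""
    union = a | b
    if not union:
        raise ValueError("neither method predicted any complex")
    return len(a ^ b) / len(union)


# Synthetic IDs matching only the Top1 counts: |A| = 9, |B| = 16, |A∩B| = 3.
shared = {f"s{i}" for i in range(3)}
nppd = shared | {f"n{i}" for i in range(6)}
irad = shared | {f"i{i}" for i in range(13)}

top1_pct = round(100 * complementarity(nppd, irad))
```

With these counts the union has 22 members and the symmetric difference 19, so the ratio is 19/22, which rounds to the 86% reported for Top1.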

3.2. Comparison with Other Network-Based Methods

As mentioned in the Introduction, two other groups have used AANs to help score docking poses [37,38]. Table 2 compares our results with their reported success rates and shows that, using the same benchmark dataset and the same criterion for success hits, when the scoring was based on network parameters alone, NPPD produced better Top1 and Top10 success rates: e.g., the Top10 success rate was 18.5% using NPPD versus 10.6% in Pons et al. [37] for the 176 complexes of the benchmark, and 25.6% using NPPD versus 23.2% in Chang et al. [38] for a subset of 43 complexes. However, it should be noted that different sampling algorithms (FTDock [16], RosettaDock [75], and ZDOCK [66]) were used to generate the same number of poses for evaluation, which may have contributed to the differences in success rates obtained. Several aspects of the use of AANs were also different: (i) as mentioned earlier, our AAN was different from that of Pons et al. [37], which represents all amino acids by just one type of network node, and from that of Chang et al. [38], which, although, like ours, has both H and P nodes, creates two separate AANs for the two different types of nodes; (ii) as also mentioned earlier, unlike these two other networks, our AAN did not include inter-protein contacts; (iii) whereas we used dyadicity and heterophilicity parameters for scoring, the other two studies used more conventional network parameters, such as degree and clustering coefficient [38] and closeness and betweenness [37]; (iv) NPPD was used to score docking poses by itself, whereas the network-based scoring functions of the other two studies are additional terms that can be added to an existing scoring function to give a better result [37,38] (Table 2). If the improvement seen in those studies also applies to our method, incorporating NPPD into existing scoring functions should achieve significantly higher success rates.

Table 2. Conditions and Top1/Top10 success rates for NPPD and two other network-based scoring functions.

| Conditions of docking poses | 176 Complexes: Pons et al. [37] | 176 Complexes: NPPD | 43 Complexes: Chang et al. [38] | 43 Complexes: NPPD |
|---|---|---|---|---|
| Generation of docking poses | FTDock [16] | ZDOCK | RosettaDock 1.0 [75] | ZDOCK |
| Number of poses generated | 10,000 | 10,000 | 1000 | 1000 |
| Criterion for a success hit | L-RMSD < 10 Å | L-RMSD < 10 Å | L-RMSD < 5 Å | L-RMSD < 5 Å |
| Top1 success rate * | 5.0% (7.0%) | 8.0% | 2.3% (25.6%) | 11.6% |
| Top10 success rate * | 10.6% (29.8%) | 18.5% | 23.2% (53.4%) | 25.6% |

* The values in parentheses are success rates produced by combining the network parameters and the energy terms of the sampling method.

3.3. Performance of NPPD in a Comprehensive Evaluation of a Number of PPD Scoring Functions

Since many factors can affect the performance of PPD scoring functions, one example being the evaluation of docking poses produced by different sampling methods as mentioned above, it was important to evaluate NPPD further. Recently, a large-scale evaluation of 115 PPD scoring functions was reported [67], in which the authors ranked these scoring functions by comparing their Top1, Top10, and Top100 success rates on a set of docking poses produced by SwarmDock [68]. As shown in Figure 3a, using the same set of docking poses, the leave-one-out Bayesian model of NPPD produced TopN success rates comparable to those produced by the best performers of the 115 scoring functions evaluated (ranked 7th by Top10 success rate). Note that, with the exception of the 1st-ranked ZRANK2 method [12], an earlier version of IRAD, which perhaps stands out a little bit from the others, these 20 top performers were more or less equally good, as the absolute ranking depended on which success rate (Top1, 10, or 100) and which quality measure (acceptable, medium, or high) were used as the basis for ranking. Note also that, of these top performers, NPPD was the only one using network parameters (the scoring functions of Pons et al. [37] and Chang et al. [38] were not included in the 115 PPD scoring functions previously evaluated [67]).

Using the complementarity between two PPD scoring functions as defined in Table 1, i.e., the ratio of the number of complexes successfully predicted by either, but not both, of the two functions divided by the total number of successfully predicted complexes, the results, presented in Figure 3b, showed that the complementarity of NPPD with each of 16 other best performers was generally higher than the averaged complementarity exhibited by the other methods, especially in the case of the Top1 and Top10 success rates. Interestingly, although SPIDER [76], another AAN-based PPD scoring function, ranked only 38th of the 115 scoring functions evaluated [67], it is especially good at predicting complexes not detected by conventional scoring functions [67]. Unlike NPPD and the methods used by Pons et al. [37] and Chang et al. [38], SPIDER uses motifs of network structures, rather than network parameters, for scoring.

3.4. Some Limitations and Prospects

Without the ability to handle large conformational change induced by complex formation, PPD methods would perform badly for such complexes [2]. Indeed, both NPPD and IRAD failed to produce a Top100 success hit for those in the benchmark set with the largest unbound/bound IRMSDs, indicative of a significant change in conformation between the unbound and bound form of the complex (Figure 4). However, conformational change is not the only culprit for failures in PPD predictions. Figure 4 shows that if sampling could not produce a sufficient number, say 300, of positive (good) poses as defined by IRMSD < 2.5 Å (see Figure 1b) to score upon, the likelihood for either NPPD or IRAD to succeed was drastically decreased, even for complexes considered as “rigid” [65]. Further analysis indicated that some of these “rigid” complexes had a particularly small interface and hence might be difficult to sample and predict [77]. Since the best current scoring functions all performed similarly (Figure 3), we speculate that the same two factors, conformational change and insufficient sampling of good poses, also limit the success of other PPD methods. Note that while the sampling of good poses among different complexes was unbalanced, the distribution of the attributes used by NPPD was not (Figure 4), suggesting that sampling bias would not significantly affect training of the Bayesian model. While it is not entirely clear to us what gave rise to the apparently poor correlation between the number of good poses sampled and unbound/bound IRMSD as observed in Figure 4, it is notable that NPPD was better than IRAD for a few of those with the smallest unbound/bound IRMSDs and poor sampling, whereas IRAD did much better than NPPD for those ranked next in unbound/bound IRMSD (roughly between complex 1PPE and 2QFW in Figure 4), thereby contributing partly to the high complementarity between the two methods (Table 1). 
Taking all these results together, we conclude that, while combining different scoring functions is still likely to improve PPD performance significantly, the main barriers to overcome remain those arising from sampling and conformational change.

Figure 3. Benchmark results for NPPD and complementarity of NPPD and several best performing PPD scoring functions. (a) The 20 best performing PPD scoring functions ordered, from left to right, by increasing Top10 success rate. All data except those for NPPD were taken from [67]. Note that the Top1, Top10, and Top100 success rates for each method, shown, respectively, as the left, center, and right bar in each group, were computed using a set of unbound docking poses (~500 for each of 118 complexes) generated by SwarmDock [68], which was different from the set generated by ZDOCK used in Figure 2 and Table 1. The leave-one-out Bayesian model of NPPD was therefore derived using these SwarmDock poses, but otherwise using the same procedures described in Figure 1. The portions of success rates for high, medium, and acceptable quality poses are shown, respectively, in red, orange, and yellow, the criteria for the three quality measures being those used by CAPRI [2]; (b) Complementarity between NPPD and each of 16 other best performing PPD scoring functions. The blue, purple, and green bars indicate the complementarity, as defined in Table 1, computed based on, respectively, the Top1, Top10, or Top100 success rates. The horizontal blue, purple, and green lines are the averaged complementarity for, respectively, the Top1, Top10, or Top100 success rates for all pairs of the 16 scoring functions. Three of the 19 scoring functions compared with NPPD in (a), namely SIPPER, PYDOCK_TOT, and PROPNSTS, were not included because the data were not made available to us. References for these 19 PPD scoring functions can be found in Reference [67] and references therein.

In this work, instead of using two-fold validation as did Chang et al. [38], we opted for the leave-one-out validation of machine learning so that every complex of the benchmark set can be a test and the performance of NPPD can be fully compared with other scoring functions. Technical differences aside, machine learning techniques are known to be unreliable for extrapolation, and only methods based on first-principles physics can truly predict and would not fail miserably when encountering complexes with an unusual interface [78]. However, as such an ideal method is not yet in sight, there is room and merit to further develop empirical methods, such as NPPD, since a new method, particularly a nonconventional one, can often reveal shortfalls of existing methods.

Figure 4. Number of positive poses and Dp-p plotted against unbound/bound IRMSD. The 176 benchmark complexes of ZDOCK are ordered in increasing unbound/bound IRMSD, the best RMSD of interface residues superimposed between the unbound form and the bound form of the complex, with the PDB ID of every 5th complex indicated on the X-axis. Dashed line denotes a number of 300 positive poses. In the top half of the figure are the averages and standard deviations of the parameter Dp-p computed from the positive poses of each complex; all other attributes used by NPPD, and for negative poses, showed a similar random distribution [64].

4. Conclusions

In this work, we showed that a Bayesian model based on the dyadic parameters of AANs of docking poses performed well compared to the best scoring functions currently used for PPD predictions. Furthermore, the results showed that our method can complement other methods by finding good poses for a significant number of complexes missed by these methods. Taken together with the findings in a recent large-scale evaluation of 115 PPD scoring functions [67], these results suggest that non-conventional scoring functions, such as that developed in the present study, are worthy of further investigation in the effort to improve the prediction of protein complex structures.

Acknowledgments

We thank Fernández-Recio for providing the SwarmDock models. This work was supported by the Ministry of Science and Technology, Taiwan (grant nos. NSC97-2311-B-001-011-MY3 and NSC-97-2627-P-001-004). We thank Tom Barkas for English editing.

Author Contributions

Edward S.C. Shih and Ming-Jing Hwang conceived and designed the experiments, analyzed the data, and wrote the paper, while Edward S.C. Shih performed the experiments.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Mosca, R.; Pons, T.; Ceol, A.; Valencia, A.; Aloy, P. Towards a detailed atlas of protein-protein interactions. Curr. Opin. Struct. Biol. 2013, 23, 929-940. doi:10.1016/j.sbi.2013.07.005. PMID: 23896349.
2. Lensink, M.F.; Wodak, S.J. Docking, scoring, and affinity prediction in CAPRI. Proteins 2013, 81, 2082-2095. doi:10.1002/prot.24428. PMID: 24115211.
3. Moal, I.H.; Moretti, R.; Baker, D.; Fernandez-Recio, J. Scoring functions for protein-protein interactions. Curr. Opin. Struct. Biol. 2013, 23, 862-867. doi:10.1016/j.sbi.2013.06.017. PMID: 23871100.
4. Shih, E.S.C.; Hwang, M.J. A critical assessment of information-guided protein-protein docking predictions. Mol. Cell Proteomics 2013, 12, 679-686. doi:10.1074/mcp.M112.020198. PMID: 23242549.
5. Shih, E.S.C.; Hwang, M.J. On the use of distance constraints in protein-protein docking computations. Proteins Struct. Funct. Bioinform. 2012, 80, 194-205. doi:10.1002/prot.23179.
6. Viswanath, S.; Ravikant, D.V.; Elber, R. Improving ranking of models for protein complexes with side chain modeling and atomic potentials. Proteins 2013, 81, 592-606. doi:10.1002/prot.24214. PMID: 23180599.
7. Pallara, C.; Jimenez-Garcia, B.; Perez-Cano, L.; Romero-Durana, M.; Solernou, A.; Grosdidier, S.; Pons, C.; Moal, I.H.; Fernandez-Recio, J. Expanding the frontiers of protein-protein modeling: From docking and scoring to binding affinity predictions and other challenges. Proteins 2013, 81, 2192-2200. doi:10.1002/prot.24387. PMID: 23934865.
8. Pons, C.; Talavera, D.; de la Cruz, X.; Orozco, M.; Fernandez-Recio, J. Scoring by intermolecular pairwise propensities of exposed residues (SIPPER): A new efficient potential for protein-protein docking. J. Chem. Inf. Model 2011, 51, 370-337. doi:10.1021/ci100353e. PMID: 21214199.
9. Mitra, P.; Pal, D. Using correlated parameters for improved ranking of protein-protein docking decoys. J. Comput. Chem. 2011, 32, 787-796. doi:10.1002/jcc.21657. PMID: 20941737.
10. Tobi, D. Designing coarse grained-and atom based-potentials for protein-protein docking. BMC Struct. Biol. 2010, 10, 40. doi:10.1186/1472-6807-10-40. PMID: 21078143.
11. Demir-Kavuk, O.; Krull, F.; Chae, M.H.; Knapp, E.W. Predicting protein complex geometries with linear scoring functions. Genome Inform. 2010, 24, 21-30. PMID: 22081586.
12. Pierce, B.; Weng, Z. A combination of rescoring and refinement significantly improves protein docking performance. Proteins 2008, 72, 270-279. doi:10.1002/prot.21920. PMID: 18214977.
13. Andrusier, N.; Nussinov, R.; Wolfson, H.J. FireDock: Fast interaction refinement in molecular docking. Proteins 2007, 69, 139-159. doi:10.1002/prot.21495. PMID: 17598144.
14. Cheng, T.M.; Blundell, T.L.; Fernandez-Recio, J. pyDock: Electrostatics and desolvation for effective scoring of rigid-body protein-protein docking. Proteins 2007, 68, 503-515. doi:10.1002/prot.21419. PMID: 17444519.
15. Murphy, J.; Gatchell, D.W.; Prasad, J.C.; Vajda, S. Combination of scoring functions improves discrimination in protein-protein docking. Proteins 2003, 53, 840-854. doi:10.1002/prot.10473. PMID: 14635126.
16. Gabb, H.A.; Jackson, R.M.; Sternberg, M.J. Modelling protein docking using shape complementarity, electrostatics and biochemical information. J. Mol. Biol. 1997, 272, 106-120. doi:10.1006/jmbi.1997.1203. PMID: 9299341.
17. Liu, S.; Vakser, I.A. DECK: Distance and environment-dependent, coarse-grained, knowledge-based potentials for protein-protein docking. BMC Bioinform. 2011, 12, 280. doi:10.1186/1471-2105-12-280.
18. Lu, H.; Lu, L.; Skolnick, J. Development of unified statistical potentials describing protein-protein interactions. Biophys. J. 2003, 84, 1895-1901. doi:10.1016/S0006-3495(03)74997-2. PMID: 12609891.
19. Miyazawa, S.; Jernigan, R.L. Self-consistent estimation of inter-residue protein contact energies based on an equilibrium mixture approximation of residues. Proteins 1999, 34, 49-68. doi:10.1002/(SICI)1097-0134(19990101)34:1<49::AID-PROT5>3.0.CO;2-L. PMID: 10336383.
20. Omori, S.; Kitao, A. CyClus: A fast, comprehensive cylindrical interface approximation clustering/reranking method for rigid-body protein-protein docking decoys. Proteins 2013, 81, 1005-1016. doi:10.1002/prot.24252. PMID: 23344972.
21. Chuang, G.Y.; Kozakov, D.; Brenke, R.; Comeau, S.R.; Vajda, S. DARS (Decoys as the Reference State) potentials for protein-protein docking. Biophys. J. 2008, 95, 4217-4227. doi:10.1529/biophysj.108.135814. PMID: 18676649.
22. Muller, W.; Sticht, H. A protein-specifically adapted scoring function for the reranking of docking solutions. Proteins 2007, 67, 98-111. doi:10.1002/prot.21310. PMID: 17243180.
23. Esmaielbeiki, R.; Nebel, J.C. Scoring docking conformations using predicted protein interfaces. BMC Bioinform. 2014, 15, 171. doi:10.1186/1471-2105-15-171.
24. Anishchenko, I.; Kundrotas, P.J.; Tuzikov, A.V.; Vakser, I.A. Protein models: The grand challenge of protein docking. Proteins 2014, 82, 278-287. doi:10.1002/prot.24385. PMID: 23934791.
25. Kundrotas, P.J.; Vakser, I.A. Global and local structural similarity in protein-protein complexes: Implications for template-based docking. Proteins 2013, 81, 2137-2142. doi:10.1002/prot.24392. PMID: 23946125.
26. Torchala, M.; Moal, I.H.; Chaleil, R.A.; Agius, R.; Bates, P.A. A Markov-chain model description of binding funnels to enhance the ranking of docked solutions. Proteins 2013, 81, 2143-2149. doi:10.1002/prot.24369. PMID: 23900714.
27. London, N.; Schueler-Furman, O. Funnel hunting in a rough terrain: Learning and discriminating native energy funnels. Structure 2008, 16, 269-279. doi:10.1016/j.str.2007.11.013. PMID: 18275818.
28. Kozakov, D.; Schueler-Furman, O.; Vajda, S. Discrimination of near-native structures in protein-protein docking by testing the stability of local minima. Proteins 2008, 72, 993-1004. doi:10.1002/prot.21997. PMID: 18300245.
29. Schneidman-Duhovny, D.; Rossi, A.; Avila-Sakar, A.; Kim, S.J.; Velazquez-Muriel, J.; Strop, P.; Liang, H.; Krukenberg, K.A.; Liao, M.; Kim, H.M. A method for integrative structure determination of protein-protein complexes. Bioinformatics 2012, 28, 3282-3289. doi:10.1093/bioinformatics/bts628. PMID: 23093611.
30. De Vries, S.J.; Bonvin, A.M. CPORT: A consensus interface predictor and its performance in prediction-driven docking with HADDOCK. PLOS ONE 2011, 6, e17695. doi:10.1371/journal.pone.0017695. PMID: 21464987.
31. Gu, S.; Koehl, P.; Hass, J.; Amenta, N. Surface-histogram: A new shape descriptor for protein-protein docking. Proteins 2012, 80, 221-238. doi:10.1002/prot.23192. PMID: 22072544.
32. Shentu, Z.; al Hasan, M.; Bystroff, C.; Zaki, M.J. Context shapes: Efficient complementary shape matching for protein-protein docking. Proteins 2008, 70, 1056-1073. doi:10.1002/prot.21600. PMID: 17847098.
33. Fink, F.; Hochrein, J.; Wolowski, V.; Merkl, R.; Gronwald, W. PROCOS: Computational analysis of protein-protein complexes. J. Comput. Chem. 2011, 32, 2575-2586. doi:10.1002/jcc.21837. PMID: 21630291.
34. Bourquard, T.; Bernauer, J.; Aze, J.; Poupon, A. A collaborative filtering approach for protein-protein docking scoring functions. PLOS ONE 2011, 6, e18541. doi:10.1371/journal.pone.0018541. PMID: 21526112.
35. Chae, M.H.; Krull, F.; Lorenzen, S.; Knapp, E.W. Predicting protein complex geometries with a neural network. Proteins 2010, 78, 1026-1039. doi:10.1002/prot.22626. PMID: 19938153.
36. Andreani, J.; Faure, G.; Guerois, R. InterEvScore: A novel coarse-grained interface scoring function using a multi-body statistical potential coupled to evolution. Bioinformatics 2013, 29, 1742-1749. doi:10.1093/bioinformatics/btt260. PMID: 23652426.
37. Pons, C.; Glaser, F.; Fernandez-Recio, J. Prediction of protein-binding areas by small-world residue networks and application to docking. BMC Bioinform. 2011, 12, 378. doi:10.1186/1471-2105-12-378.
38. Chang, S.; Jiao, X.; Li, C.H.; Gong, X.Q.; Chen, W.Z.; Wang, C.X. Amino acid network and its scoring application in protein-protein docking. Biophys. Chem. 2008, 134, 111-118. doi:10.1016/j.bpc.2007.12.005. PMID: 18329160.
39. Zhang, X.; Perica, T.; Teichmann, S.A. Evolution of protein structures and interactions from the perspective of residue contact networks. Curr. Opin. Struct. Biol. 2013, 23, 954-963. doi:10.1016/j.sbi.2013.07.004. PMID: 23890840.
40. Di Paola, L.; de Ruvo, M.; Paci, P.; Santoni, D.; Giuliani, A. Protein contact networks: An emerging paradigm in chemistry. Chem. Rev. 2013, 113, 1598-1613. doi:10.1021/cr3002356. PMID: 23186336.
41. Greene, L.H. Protein structure networks. Brief Funct. Genomics 2012, 11, 469-478. doi:10.1093/bfgp/els039. PMID: 23042823.
42. Giollo, M.; Martin, A.J.; Walsh, I.; Ferrari, C.; Tosatto, S.C. NeEMO: A method using residue interaction networks to improve prediction of protein stability upon mutation. BMC Genomics 2014, 15, S7. doi:10.1186/1471-2164-15-S4-S7. PMID: 25057121.
43. Krishnan, A.; Zbilut, J.P.; Tomita, M.; Giuliani, A. Proteins as networks: Usefulness of graph theory in protein science. Curr. Protein Pept. Sci. 2008, 9, 28-38. doi:10.2174/138920308783565705. PMID: 18336321.
44. Yan, W.; Zhou, J.; Sun, M.; Chen, J.; Hu, G.; Shen, B. The construction of an amino acid network for understanding protein structure and function. Amino Acids 2014, 46, 1419-1439. doi:10.1007/s00726-014-1710-6. PMID: 24623120.
45. Peng, W.; Wang, J.; Chen, L.; Zhong, J.; Zhang, Z.; Pan, Y. Predicting Protein Functions by using unbalanced bi-random walk algorithm on protein-protein interaction network and functional interrelationship network. Curr. Protein Pept. Sci. 2014, 15, 529-539. doi:10.2174/1389203715666140724085224. PMID: 25059324.
46. Axe, J.M.; Yezdimer, E.M.; O’Rourke, K.F.; Kerstetter, N.E.; You, W.; Chang, C.E.; Boehr, D.D. Amino acid networks in a (beta/alpha)(8) barrel enzyme change during catalytic turnover. J. Am. Chem. Soc. 2014, 136, 6818-6821. doi:10.1021/ja501602t. PMID: 24766576.
47. Lee, B.C.; Park, K.; Kim, D. Analysis of the residue-residue coevolution network and the functionally important residues in proteins. Proteins 2008, 72, 863-872. doi:10.1002/prot.21972. PMID: 18275083.
48. Luo, Q.; Hamer, R.; Reinert, G.; Deane, C.M. Local network patterns in protein-protein interfaces. PLOS ONE 2013, 8, e57031. doi:10.1371/journal.pone.0057031. PMID: 23520460.
49. Johnson, M.E.; Hummer, G. Interface-resolved network of protein-protein interactions. PLOS Comput. Biol. 2013, 9, e1003065. doi:10.1371/journal.pcbi.1003065. PMID: 23696724.
50. Goebels, F.; Frishman, D. Prediction of protein interaction types based on sequence and network features. BMC Syst. Biol. 2013, 7, S5. doi:10.1186/1752-0509-7-S6-S5. PMID: 24564924.
51. Del Sol, A.; O’Meara, P. Small-world network approach to identify key residues in protein-protein interaction. Proteins 2005, 58, 672-682. doi:10.1002/prot.20348. PMID: 15617065.
52. Maetschke, S.R.; Yuan, Z. Exploiting structural and topological information to improve prediction of RNA-protein binding sites. BMC Bioinform. 2009, 10, 341. doi:10.1186/1471-2105-10-341.
53. Sathyapriya, R.; Vijayabaskar, M.S.; Vishveshwara, S. Insights into protein-DNA interactions through structure network analysis. PLOS Comput. Biol. 2008, 4, e1000170. doi:10.1371/journal.pcbi.1000170. PMID: 18773096.
54. Montiel Molina, H.M.; Millan-Pacheco, C.; Pastor, N.; del Rio, G. Computer-based screening of functional conformers of proteins. PLOS Comput. Biol.
2008 4 e1000009 10.1371/journal.pcbi.1000009 18463705 BodeC.KovacsI.A.SzalayM.S.PalotaiR.KorcsmarosT.CsermelyP. Network analysis of protein dynamics FEBS Lett. 2007 581 2776 2782 10.1016/j.febslet.2007.05.021 17531981 LiJ.WangJ.WangW. Identifying folding nucleus based on residue contact networks of proteins Proteins 2008 71 1899 1907 10.1002/prot.21891 18175318 BaglerG.SinhaS. Assortative mixing in protein contact networks and protein folding kinetics Bioinformatics 2007 23 1760 1707 10.1093/bioinformatics/btm257 17519248 VendruscoloM.DokholyanN.V.PaciE.KarplusM. Small-world view of the amino acids that play a key role in protein folding Phys. Rev. E 2002 65 061910 10.1103/PhysRevE.65.061910 BhattacharyyaM.BhatC.R.VishveshwaraS. An automated approach to network features of protein structure ensembles Protein Sci. 2013 22 1399 1416 23934896 KhorS. Towards an integrated understanding of the structural characteristics of protein residue networks Theory Biosci. 2012 131 61 75 10.1007/s12064-011-0135-y 21948188 EstradaE. Universality in protein residue networks Biophys. J. 2010 98 890 900 10.1016/j.bpj.2009.11.017 20197043 BrindaK.V.VishveshwaraS. A network representation of protein structures: Implications for protein stability Biophys. J. 2005 89 4159 4170 10.1529/biophysj.105.064485 16150969 BaglerG.SinhaS. Network properties of protein structures Phys. A 2005 346 27 33 10.1016/j.physa.2004.08.046 ShihE.S.C.HwangM.-J.Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan Unpublished data 2015 HwangH.VrevenT.JaninJ.WengZ. Protein-protein docking benchmark version 4.0 Proteins 2010 78 3111 3114 10.1002/prot.22830 20806234 PierceB.G.HouraiY.WengZ.P. Accelerating protein docking in ZDOCK using an advanced 3D convolution library PLOS ONE 2011 6 e24657 10.1371/journal.pone.0024657 21949741 MoalI.H.TorchalaM.BatesP.A.Fernandez-RecioJ. The scoring of poses in protein-protein docking: Current capabilities and future directions BMC Bioinform. 
2013 14 286 10.1186/1471-2105-14-286 TorchalaM.BatesP.A. Predicting the structure of protein-protein complexes using the SwarmDock Web Server Methods Mol. Biol. 2014 1137 181 197 24573482 EisenbergD.WeissR.M.TerwilligerT.C.WilcoxW. Hydrophobic Moments and Protein-Structure Faraday Symp. Chem. S 1982 17 109 120 10.1039/fs9821700109 ParkJ.BarabasiA.L. Distribution of node characteristics in complex networks Proc. Natl. Acad. Sci. USA 2007 104 17916 17920 10.1073/pnas.0705081104 17989231 FienbergS.E.MeyerM.M.WassermanS.S. Statistical-Analysis of Multiple Sociometric Relations J. Am. Stat. Assoc. 1985 80 51 67 10.1080/01621459.1985.10477129 HallM.FrankE.HolmesG.PfahringerB.ReutemannP.WittenI.H. The WEKA data mining software: An update SIGKDD Explor. Newsl. 2009 11 10 18 10.1145/1656274.1656278 NeedhamC.J.BradfordJ.R.BulpittA.J.WestheadD.R. Inference in Bayesian networks Nat. Biotechnol. 2006 24 51 53 10.1038/nbt0106-51 16404397 VrevenT.HwangH.WengZ. Integrating atom-based and residue-based scoring functions for protein-protein docking Protein Sci. 2011 20 1576 1586 10.1002/pro.687 21739500 GrayJ.J.MoughonS.WangC.Schueler-FurmanO.KuhlmanB.RohlC.A.BakerD. Protein-protein docking with simultaneous optimization of rigid-body displacement and side-chain conformations J. Mol. Biol. 2003 331 281 299 10.1016/S0022-2836(03)00670-3 12875852 KhashanR.ZhengW.TropshaA. Scoring protein interaction decoys using exposed residues (SPIDER): A novel multibody interaction scoring function based on frequent geometric patterns of interfacial residues Proteins 2012 80 2207 2217 10.1002/prot.24110 22581643 RitchieD.W.KozakovD.VajdaS. Accelerating and focusing protein-protein docking correlations using multi-dimensional rotational FFT generating functions Bioinformatics 2008 24 1865 1873 10.1093/bioinformatics/btn334 18591193 MoreiraI.S.MartinsJ.M.CoimbraJ.T.RamosM.J.FernandesP.A. 
A new scoring function for protein-protein docking that identifies native structures with unprecedented accuracy Phys. Chem. Chem. Phys. 2015 17 2378 2387 10.1039/C4CP04688A 25490550