Cancer Epidemiology, Combinatorial Drug Therapy, Personalized Medicine, Network biology


Cancer has become one of the biggest health threats in the world and is a leading cause of death worldwide. In 2008 cancer caused 7.6 million deaths worldwide; by 2012 this had increased to 8.2 million deaths, with 32.6 million people living with cancer (within 5 years of diagnosis). Deaths from cancer worldwide are projected to continue rising, with an estimated 17 million deaths in 2030, according to the International Agency for Research on Cancer (Ferlay et al., 2010, Ferlay J, 2012, Jemal et al., 2011, Lyon, 2013). According to the Lancet Oncology Commission report titled “Delivering Affordable Cancer Care in High-Income Countries”, published Sept. 26, U.S. cancer spending grew from $27 billion in 1990 to $90 billion in 2008 and is projected to soar to $157 billion by 2020 (Keogh, 2012). Surgery, radiation and chemotherapy are the standard methods for the treatment of cancer. These therapies succeed to varying extents; their success is a function of the benefits and adverse effects associated with them. They are largely successful in relieving symptoms and extending patient survival time, but they are also associated with severe side effects, and even with the development of secondary cancers, owing to the non-specific nature of these therapeutic methods (Breitkreutz et al., 2012, Brenner, 2002, Ellison and Gibbons, 2006, Jemal et al., 2011, Jemal et al., 2009). That is why, with a few notable exceptions, mortality rates for the major classes of cancer have not changed notably over the last few decades. This has shifted the emphasis toward molecularly targeted therapy, which is a growing part of many cancer treatment regimens.
Molecularly targeted therapy is a newer type of cancer treatment that uses drugs or other substances to inhibit crucial cancer signaling pathways more precisely, selectively killing cancerous cells with minimal damage to normal cells. Examples include small-molecule drugs such as Gleevec® (imatinib mesylate), a tyrosine kinase inhibitor; Iressa® (gefitinib), an epidermal growth factor receptor (EGFR) inhibitor; and Sutent® (sunitinib), a vascular endothelial growth factor (VEGF) receptor inhibitor. Another class of substances used in targeted therapy is antibody drugs, such as Avastin® (bevacizumab), a VEGF-blocking antibody, and Erbitux® (cetuximab), which is designed to seek out and lock onto EGFR to facilitate the killing of cancer cells (Bayraktar and Rocha-Lima, 2013, Huynh, 2010, Sathornsumetee et al., 2007). Although molecularly targeted drugs do not affect the normal functions of the body to the extent that standard chemotherapy does, they still cause some side effects. For example, angiogenesis inhibitors such as Nexavar® (sorafenib), Sutent® (sunitinib) and Votrient® (pazopanib) interfere with the formation of new blood vessels, which can lead to problems with bruising and bleeding (Breitkreutz et al., 2012, Elice and Rodeghiero, 2010, 2012). Cancer is a systems biology disease. The etiology of cancer involves a complex interplay of cancer-signaling pathways that translate external stimuli (such as hormonal signals, growth factors, or microenvironmental stress) into appropriate biological responses, such as cell growth, proliferation, differentiation, or apoptosis. There is ample evidence that most or all cancers display dysregulation of several signaling pathways, resulting in the cells' acquired independence of external growth factors (Baker and Kramer, 2011, Laubenbacher et al., 2009).
Therefore, understanding the cancer signaling pathway during carcinogenesis, cancer progression, and its response to therapy (radiation, immunotherapy and chemotherapy) is crucial for the development of more effective therapeutic methods (Breitkreutz et al., 2012, Faratian et al., 2009, Hornberg et al., 2006, van der Greef et al., 2007). The present study investigates the correlation between cancer-signaling network complexity and cancer epidemiological data sets. Molecular pathways for a number of cancer sites were examined and two network metrics, betweenness centrality and clustering coefficient, were computed. The results revealed that the clustering coefficient, which represents network complexity, is correlated with the cancer epidemiological data sets. Cancer networks with higher network complexity were also associated with a higher risk of cancer.

Material And Methods

Dataset collection

For some types of cancer, the detailed molecular pathway has been resolved. The pathway data sets for these cancers are available in many public, freely available pathway databases such as BioCyc, Reactome, BioGRID, BioCarta, Kyoto Encyclopedia of Genes and Genomes (KEGG), human signalling network, Pathway Interaction Database and Biological-Networks. Pathways in these databases are curated from the literature, computationally predicted metabolic pathways, or imported from other databases such as NCBI, Ensembl and UniProt. Cancer sites whose detailed molecular pathway networks have been worked out, and whose data sets are present in public, freely available pathway databases, were selected for this study (Table-1).

Cancer statistics for incidence, mortality and lifetime risk were accessed from the Surveillance Epidemiology and End Results (SEER) Program database, which is a resource for epidemiological data compiled by the National Cancer Institute as a service to researchers and physicians. Only the incidence, mortality and lifetime risk statistics for the cancer sites selected for this study were used (Table-1).

Network Assessment and Analysis

Network assessment and network metrics analysis were performed with the software Cytoscape 3.0.1. The clustering coefficient (a measure of complexity) and betweenness centrality (a measure of the extent to which a node lies on the paths between other nodes) are the statistical parameters used in the correlation analysis with the epidemiological data sets.

The betweenness centrality of a node v is given by the expression:

$$g(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}$$

where $\sigma_{st}$ is the total number of shortest paths from node s to node t and $\sigma_{st}(v)$ is the number of those paths that pass through v (Junker and Schreiber, 2007, Koschutzki, 2008).
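
To make the definition concrete, the following Python sketch evaluates g(v) on a small toy graph. It is only an illustration under stated assumptions: the network is treated as unweighted and undirected, the node labels and edges are hypothetical, and the routine is a plain implementation of Brandes' accumulation scheme rather than the Cytoscape code used in this study. The sum runs over ordered pairs (s, t), as in the expression for g(v); some tools report half this value for undirected graphs, or a normalized value.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    adj = {}
    for u, w in edges:
        adj.setdefault(u, set()).add(w)
        adj.setdefault(w, set()).add(u)
    return adj


def betweenness_centrality(adj):
    """Unnormalized betweenness g(v), summed over ordered pairs (s, t)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    # v precedes w on at least one shortest path from s.
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies from the farthest nodes inward.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc


# Hypothetical toy network: a chain a-b-c with two leaves (d, e) attached to c.
toy_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("c", "e")]
bc = betweenness_centrality(build_adjacency(toy_edges))
for node in sorted(bc, key=bc.get, reverse=True):
    print(node, bc[node])
```

In this toy graph node c lies on every shortest path that joins the two sides of the network, so it receives the largest value, while the leaf nodes receive zero.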

The clustering coefficient is a local property that quantifies the likelihood that the neighboring nodes of a given node i are interconnected.

It is determined as

$$C_i = \frac{2n}{k_i(k_i - 1)}$$

where $k_i$ is the degree of node i and n is the number of edges between the $k_i$ neighbors of node i (Koschutzki, 2008, Medina, 2013).
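
As a worked illustration of this expression, the Python sketch below computes the clustering coefficient of each node of a small toy graph. The graph and its node labels are hypothetical. Two conventions are assumptions rather than statements of the procedure used in this study: nodes with fewer than two neighbors are given a coefficient of 0, and the per-network value is taken as the mean over all nodes.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    adj = {}
    for u, w in edges:
        adj.setdefault(u, set()).add(w)
        adj.setdefault(w, set()).add(u)
    return adj


def clustering_coefficient(adj, i):
    """C_i = 2n / (k_i (k_i - 1)), with n = number of edges among the neighbors of i."""
    neighbors = adj[i]
    k = len(neighbors)
    if k < 2:
        return 0.0  # assumed convention: undefined cases are reported as 0
    n = sum(1 for u, w in combinations(neighbors, 2) if w in adj[u])
    return 2.0 * n / (k * (k - 1))


def average_clustering(adj):
    """Mean of the node coefficients (assumed definition of the network-level value)."""
    return sum(clustering_coefficient(adj, i) for i in adj) / len(adj)


# Hypothetical toy network: a triangle a-b-c with a pendant node d attached to c.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
adj = build_adjacency(toy_edges)
cc = {i: clustering_coefficient(adj, i) for i in adj}
avg = average_clustering(adj)
print(cc, avg)
```

Node c has three neighbors (k = 3) but only one edge among them (n = 1), so its coefficient is 2·1/(3·2) = 1/3, whereas a and b, whose two neighbors are directly connected, reach the maximum of 1.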

Statistical Analysis

The data sets of cancer epidemiology, network statistics and literature searches were analyzed using a correlation test, and statistical significance was defined as p < 0.05.
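
The correlation test can be sketched in Python as follows. The text does not name the test, so Pearson's product-moment correlation with a two-sided t test is an assumption here. The data are copied from Table 1; the critical values 2.306 (8 degrees of freedom) and 2.365 (7 degrees of freedom) are the standard two-sided 5% points of Student's t distribution. Skin cancer is dropped from the lifetime-risk calculation because its value is not available. Under these assumptions the coefficients come out close to the r values reported in Table-2.

```python
import math

# (incidence, death rate, lifetime risk, cluster coefficient), copied from Table 1.
TABLE_1 = {
    "Lung cancer": (61.4, 49.5, 6.88, 0.274),
    "Colorectal cancer": (45.0, 16.4, 4.82, 0.222),
    "Endometrial cancer": (24.3, 4.3, 2.69, 0.200),
    "Skin cancer": (23.1, 3.6, None, 0.261),  # lifetime risk not available
    "Bladder cancer": (20.7, 4.4, 2.40, 0.199),
    "Renal cell carcinoma": (15.3, 4.0, 1.61, 0.195),
    "Pancreatic cancer": (12.2, 10.9, 1.49, 0.215),
    "Glioma": (6.5, 4.3, 0.62, 0.189),
    "Melanoma": (5.9, 3.4, 0.70, 0.191),
    "AML": (3.7, 2.8, 0.41, 0.144),
}

# Two-sided 5% critical values of Student's t, keyed by degrees of freedom.
T_CRITICAL = {7: 2.365, 8: 2.306}


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlate(column):
    """Correlate one epidemiological column with the cluster coefficient."""
    pairs = [(row[column], row[3]) for row in TABLE_1.values()
             if row[column] is not None]
    xs, ys = zip(*pairs)
    n = len(xs)
    r = pearson_r(xs, ys)
    t = r * math.sqrt((n - 2) / (1.0 - r * r))  # t statistic with n - 2 d.f.
    return {"n": n, "r": r, "t": t, "significant": abs(t) > T_CRITICAL[n - 2]}


results = {
    "incidence": correlate(0),
    "death rate": correlate(1),
    "lifetime risk": correlate(2),
}
for name, res in results.items():
    print(f"{name}: n={res['n']} r={res['r']:.4f} t={res['t']:.2f} "
          f"significant={res['significant']}")
```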

Table 1. Cancer epidemiological data sets (cancer incidence, cancer death rate and lifetime risk of cancer) and network statistic (cluster coefficient) for each of the 10 cancer sites in this study. *Cancer incidence per 100,000 men and women per year; **Deaths per 100,000 men and women per year; ***Lifetime risk is the probability of developing cancer in the course of one’s lifespan. Lifetime risk may also be discussed in terms of the probability of developing or of dying from cancer. Based on cancer rates from 2008 to 2010.

| Type of cancer | Cancer incidence* | Death rate** | Lifetime risk*** | Cluster coefficient |
|---|---|---|---|---|
| Lung cancer | 61.4 | 49.5 | 6.88 | 0.274 |
| Colorectal cancer | 45 | 16.4 | 4.82 | 0.222 |
| Endometrial cancer | 24.3 | 4.3 | 2.69 | 0.2 |
| Skin cancer | 23.1 | 3.6 | NA | 0.261 |
| Bladder cancer | 20.7 | 4.4 | 2.4 | 0.199 |
| Renal cell carcinoma | 15.3 | 4 | 1.61 | 0.195 |
| Pancreatic cancer | 12.2 | 10.9 | 1.49 | 0.215 |
| Glioma | 6.5 | 4.3 | 0.62 | 0.189 |
| Melanoma | 5.9 | 3.4 | 0.7 | 0.191 |
| AML | 3.7 | 2.8 | 0.41 | 0.144 |

NA- Data not available.

Table-2. Correlation analysis between the network statistic (cluster coefficient of the cancer network pathway) and the cancer epidemiological data sets (cancer incidence, cancer death rate and lifetime risk of cancer).

| Correlation between | Correlation coefficient (r) | R squared | P value | Is the correlation significant? |
|---|---|---|---|---|
| Cancer incidence and cluster coefficient | 0.7661 | 0.5869 | 0.0098 | Yes |
| Death rate and cluster coefficient | 0.6662 | 0.4439 | 0.0354 | Yes |
| Lifetime risk and cluster coefficient | 0.8757 | 0.7668 | 0.002 | Yes |


Table-3. Search terms corresponding to each of the cancer sites in the study were used to perform searches on PubMed, Google Scholar and Web of Knowledge (Thomson Reuters) using the default (non-advanced) search type. The total number of citation results returned by each search was recorded. The searches were performed on July 22, 2013.

| Type of cancer | PubMed citations | Google Scholar citations | Web of Knowledge (Thomson Reuters) citations |
|---|---|---|---|
| Lung cancer | 56,497 | 1,820,000 | 100,508 |
| Colorectal cancer | 156,509 | 1,680,000 | 180,404 |
| Endometrial cancer | 26,062 | 922,000 | 36,334 |
| Skin cancer | 21,003 | 916,000 | 40,390 |
| Bladder cancer | 58,066 | 1,540,000 | 79,641 |
| Renal cell carcinoma | 32,603 | 1,750,000 | 73,254 |
| Pancreatic cancer | 66,365 | 1,380,000 | 77,235 |
| Glioma | 62,883 | 450,000 | 75,633 |
| Melanoma | 90,922 | 1,130,000 | 195,822 |
| AML | 57,488 | 466,000 | 89,308 |


Figure 1. Scatter plot showing the correlation between cancer incidence rate and cluster coefficient. Data points are shown for cancer sites in the study. The x axis is the incidence rate for the cancer site and the y axis is the cluster coefficient for the cancer site. The line is a linear regression fit, with Correlation coefficient (r) = 0.7661.


Figure 2. Scatter plot showing the correlation between death rate for cancer and cluster coefficient. Data points are shown for cancer sites in the study. The x axis is the death rate for the cancer site and the y axis is the cluster coefficient for the cancer site. The line is a linear regression fit, with Correlation coefficient (r) = 0.6662.


Figure 3. Scatter plot showing the correlation between lifetime risk of cancer and cluster coefficient. Data points are shown for cancer sites in the study. The x axis is the lifetime risk for the cancer site and the y axis is the cluster coefficient for the cancer site. The line is a linear regression fit, with Correlation coefficient (r) = 0.8757.


The objective of the study is to correlate statistical network metrics with cancer epidemiology. The main finding is that the clustering coefficient is directly correlated with cancer incidence, death rate and lifetime risk of cancer. The results also show that the networks with the highest clustering coefficients are associated with higher cancer incidence, death rate and lifetime risk of cancer; for example, small cell lung cancer, which has the highest clustering coefficient, also has the highest cancer incidence, death rate and lifetime risk of cancer (Table-1).

Correlation coefficients (r) and P values for the correlations between the epidemiological parameters and the clustering coefficient were calculated (Table-2). The plot of cancer incidence versus clustering coefficient shows that only one point lies outside the 95% confidence band (Figure-1). The plot of death rate versus clustering coefficient shows that only two points lie outside the 95% confidence band (Figure-2), whereas in the plot of lifetime risk versus clustering coefficient all points lie within the 95% confidence band (Figure-3). Taken together, these statistical results show that the correlations are strong and are unlikely to be due to random sampling.
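
The regression line of Figure 1 can be sketched in Python as below, assuming an ordinary least-squares fit with cancer incidence on the x axis and cluster coefficient on the y axis, as described in the figure caption. The data are copied from Table 1 and the function name is illustrative. The sketch covers only the fitted line and its R squared, not the 95% confidence band, and is not the software actually used to draw the figures.

```python
# Incidence per 100,000 (x) and cluster coefficient (y), copied from Table 1.
incidence = [61.4, 45.0, 24.3, 23.1, 20.7, 15.3, 12.2, 6.5, 5.9, 3.7]
cluster = [0.274, 0.222, 0.200, 0.261, 0.199, 0.195, 0.215, 0.189, 0.191, 0.144]


def least_squares_fit(xs, ys):
    """Return slope, intercept and R squared of the line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared = 1 - (residual sum of squares / total sum of squares).
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


slope, intercept, r_squared = least_squares_fit(incidence, cluster)
print(f"cluster coefficient = {intercept:.4f} + {slope:.6f} * incidence")
print(f"R squared = {r_squared:.4f}")
```

For a straight-line fit, R squared equals the square of the correlation coefficient, which is why Table-2 lists both quantities side by side.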

The death rate of cancer patients is influenced by the standard of treatment and the economic status of patients. Cancer incidence and lifetime risk, on the other hand, are least affected by these factors. The data of this study are consistent with this: cancer incidence and lifetime risk have a higher correlation with network complexity than death rate does.

The cancer metabolic pathways present in pathway databases are literature based, so it is possible that the extent to which a particular type of cancer has been studied influences the topology, and hence the complexity, of the networks. To test this possibility, a set of PubMed, Google Scholar and Web of Knowledge (Thomson Reuters) literature searches corresponding to each cancer site in the study was performed, and the total numbers of citations were compared with the clustering coefficient values (Table-3). No correlation between the total citations and the clustering coefficient was observed, which supports the statement that the extent to which a particular type of cancer has been studied is not biasing the results.


Topological characteristics provide crucial knowledge for identifying critical network components. Such components may be attractive druggable targets, because targeting them inhibits nonessential genes while the consequent disruption of information flow between functional modules may prove therapeutically effective with only minor toxic effects (Albert et al., 2000, Hwang et al., 2008, Palumbo et al., 2005, Zhong et al., 2009). Betweenness centrality indicates the relative importance of a particular node to the global connectivity of the network (Abbasi et al., 2012). It can therefore be helpful in the discovery of new druggable targets.

KRAS and GRB2 are the frequently occurring nodes, and the same result was reported by Breitkreutz et al. (Breitkreutz et al., 2012). KRAS, also known as V-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog or K-ras, is a Ras family oncogene. About 30% of human tumours carry ras gene mutations. Of the three genes in this family (K-ras, N-ras and H-ras), K-ras is the most frequently mutated member in adenocarcinomas of the lung (Hensing et al., 2014, Johnson et al., 2001, Mainardi et al., 2014). Therefore, alteration in KRAS activity will have significant effects on information processing in the protein–protein interaction network.

Grb2 is a ubiquitously expressed adapter protein that is essential for a variety of basic cellular functions and acts as a critical downstream intermediary in several oncogenic signaling pathways (Giubellino et al., 2008). It has been found to be overexpressed during carcinogenesis and to enhance signaling through the MAPK pathway (Gui et al., 2012, Saxton et al., 2001). Grb2 is involved in keratinocyte growth factor (KGF)-induced motility in cancer cell lines, suggesting that Grb2 can be a valid therapeutic target for the prevention of local invasion and metastasis of solid tumors (Zang et al., 2004).

Hence, the proteins selected from the molecular pathway networks of cancer on the basis of betweenness centrality can be potent drug targets. These targets provide a molecular basis for chemoprevention.

Breitkreutz et al. showed that the complexity of the cancer molecular signalling network is correlated with the 5-year survival probability of patients (Breitkreutz et al., 2012). The 5-year survival probability is greatly influenced by economic status, the standard of treatment and medical advances in cancer therapy, whereas cancer incidence and lifetime risk of cancer are influenced least (2014). Therefore, the present study uses the incidence and lifetime risk of cancer for the statistical analysis. The results of this study also indicate that these epidemiological parameters are strongly correlated with network complexity.

A complex interplay of various signalling pathways is responsible for carcinogenesis and cancer progression, which limits the efficacy of a single drug in providing the desired therapeutic result. The inability of a single drug to produce the most effective results in cancer treatment strengthens the future prospects of combinatorial targeted chemotherapy (Mishra et al., 2009, Schmidt et al., 2007, Wagner and Ulrich-Merzenich, 2009).

Systems biology has the potential to play a crucial role in predicting the most potent therapeutic targets, which in turn helps in identifying the best possible drug combination (Al-Lazikani et al., 2012, Chabner and Roberts, 2005, Chen et al., 2012, DeVita et al., 1975). Combinatorial drugs are multi-component mixtures of active compounds; they exert a synergistic effect by acting at the same or different nodes of a cancer signalling network, which increases the therapeutic potential many fold in comparison with single drug-target therapy and also compensates for toxicity and increases the bioavailability of the active compounds (Arrell and Terzic, 2010, Huie, 2002, Khan, 2006, Pawson and Linding, 2008, Schoeberl et al., 2009, Tsai et al., 2009). The ability to target multiple nodes of the cancer signalling network may restrict the ability of cancerous cells to develop resistance against combinatorial drug therapies (Al-Lazikani et al., 2012, Woodcock et al., 2011). This method is also helpful in the development of combinatorial drug therapies for other diseases such as diabetes, tuberculosis, Alzheimer’s and cardiovascular disease (Espinal et al., 2000, Hammer et al., 2006, Perry et al., 2009, Shi et al., 2012).

Systems biology aims to describe and understand the operation of complex biological systems. A systematic approach plays a crucial role in accommodating human complexity and variability and their influence on health and disease (Hood et al., 2004, Kitano, 2002, Naylor and Chen, 2010). It provides a new opportunity to find treatments for individual patients, beyond the ‘one-size-fits-all’ treatment strategy (Arrell and Terzic, 2010, Zhang et al., 2012). Advances in the tools and technologies of systems biology provide great opportunities to exploit the emerging area of personalized medicine (Naylor and Chen, 2010). Applying systems biology to personalized medicine for the treatment of various diseases, including cancer, provides a great opportunity to find the most effective diagnosis and treatment for individual patients. It also helps to make therapeutic methods more cost effective (Chuang et al., 2007, Lemberger, 2007, Morel et al., 2004).

Understanding the differential behaviour of regulatory networks in health, in disease and in response to drugs plays a crucial role in enhancing drug development efforts, new target identification, delineation of off-target effects, methods of disease prediction, combinatorial drug therapy and the development of molecularly targeted personalized treatment (Arrell et al., 2009, Bartunek et al., 2009, Nelson et al., 2008, Weston and Hood, 2004). Such networks could facilitate drug discovery by generating testable hypotheses directly from network components and structure. They also provide a template for systems modelling of drug discovery, by integrating information from multiple levels of biological complexity, as well as from drug–protein or drug–ligand interaction networks (Arrell and Terzic, 2010).


The analysis of the results provides valid support for a correlation between the complexity (clustering coefficient) of cancer network pathways and the cancer epidemiological data sets (cancer incidence, death rate and lifetime risk of cancer). The findings also support the initial assumption that the complexity of the network, as captured by network metrics, is a direct indicator of cancer threat. Understanding the complex behaviour of cancer-signalling networks is very helpful in providing a platform for the development of rational therapeutic strategies for cancer and other deadly diseases.

Conflict Of Interest

The authors declare that there is no conflict of interests regarding the publication of this paper.