J. Cent. South Univ. Technol. (2008) 15: 132-135
DOI: 10.1007/s11771-008-0026-7
Model of generic project risk element transmission theory based on data mining
LI Cun-bin(李存斌), WANG Jian-jun(王建军)
(School of Business Administration, North China Electric Power University, Beijing 102206, China)
Abstract: In order to construct a data mining frame for generic project risk research, the basic definitions of the generic project risk element were given, and a new model of the generic project risk element was presented based on these definitions. With the model, a data mining method was used to acquire the risk transmission matrix from the analysis of historical databases, which solved the quantitative calculation problem among the generic project risk elements. The method deals well with risk element transmission problems that have a limited number of states. To obtain the limited states, fuzzy theory was used to discretize the historical data in the historical databases. In an example, the controlling risk degree is chosen as P(Rs≥2)≤0.1, which means that the probability of a risk state not less than 2 in the project is not more than 0.1, and the risk element R3 is chosen to control the project. The result shows that three risk element transmission matrices can be acquired for 4 risk elements, and the frequency histogram and cumulative frequency histogram of each risk element are also given.
Key words: data mining; risk element; risk management; project management
1 Introduction
Risk is the uncertain cause of loss[1]. Because such uncertainty objectively exists in nature, society and the economy, it is difficult to carry out a project plan and analyze the project exactly. In order to form a scientific method system, risk management has been included in project management[2], where it is called project risk management. In recent years, many methods for project risk management have been presented, such as Monte Carlo risk analysis[3-4], fuzzy AHP[5-7], risk event tree analysis[8-9] and interval numbers[10]. This literature shows that the risk factor is one of the important factors in project management.
There is plenty of research on project risk, but most of it is qualitative. Quantitative research on risk is difficult, and less attention has been paid to calculating the probability distribution of the final risk element from the probability distributions of the individual risk elements. LI et al[11] presented the generic project risk element transmission theory. According to this theory, if the sub risk factors are divided reasonably, there should be some quantitative mathematical relations between the overall risk element and its sub risk elements. In other words, the risk of the final project goal can be acquired by transmitting the uncertainties of the sub risk elements. MA and LI[12] studied the generic project risk element transmission theory; however, they only gave the quantitative analysis model of the theory and did not realize the transmission process in practice.
Therefore, in this work, the preliminaries of the generic project risk element transmission theory were given, and a data mining frame was constructed on that basis. The data mining method was used to acquire the risk transmission matrix from the analysis of historical databases in order to solve the quantitative calculation. The risk element to be controlled was also identified for a given controlling risk degree.
2 Summary of risk element transmission theory
The main idea of the generic project risk element transmission theory is the relationship between the overall risk element and its sub risk elements. Therefore, the risk elements can be considered as the sub risk factors influencing the final goal. When a generic project is regarded as being composed of a limited number of states, and the final goal is affected by the transmission of each state, each state can be defined as a risk element. A simple risk element transmission train is shown in Fig.1.
Fig.1 Scheme of transmission train for simple risk elements
The definitions of the risk element transmission are as follows.
Definition 1 A risk element is a state in the generic project, where R1 is the beginning risk element, Ri (1<i<n) is a transmission risk element, and Rn is the ending risk element.
Definition 2 Each risk element appears in one of several states X={Q1, Q2, …, Qn}, named the state space of the risk element.
Definition 3 The risk element transmission process begins from the anterior risk element R1, crosses the transmission risk elements R2, …, Rn-1, and finally reaches the final risk element Rn.
Definition 4 State Xi of risk element Ri is acquired from the anterior risk element, and its probability P(i) is expressed by
(1)
where 0≤P(i)≤1.
Definition 5 The state space transmission probability matrix between Ri and Ri+1 is defined as the risk element transmission matrix Ai=(aij),
where aij expresses the transmission probability from the state Qi of Ri to the state Qj of Ri+1, and 0≤aij≤1, with each row of Ai summing to 1.
The transmission between Ri and Ri+1 is
Ri+1=RiAi (2)
Therefore, the transmission between Ri and Rn is
Rn=R1A1A2…An-1 (3)
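Eqns.(2) and (3) amount to repeated multiplication of a row vector of state probabilities by the transmission matrices. The following is a minimal sketch of that calculation in plain Python; the function names `transmit` and `transmit_chain` and the two-state numbers are hypothetical and serve only as illustration, they are not taken from the example in this paper.

```python
from typing import List, Sequence

Vector = List[float]


def transmit(r: Sequence[float], a: Sequence[Sequence[float]]) -> Vector:
    """One step of Eqn.(2): row vector r (state probabilities of Ri)
    multiplied by the transmission matrix a (Ai)."""
    if len(r) != len(a):
        raise ValueError("length of r must equal the number of rows of a")
    n_cols = len(a[0])
    return [sum(r[i] * a[i][j] for i in range(len(r))) for j in range(n_cols)]


def transmit_chain(r1: Sequence[float],
                   matrices: Sequence[Sequence[Sequence[float]]]) -> List[Vector]:
    """Eqn.(3) applied step by step; returns [R1, R2, ..., Rn]."""
    chain = [list(r1)]
    for a in matrices:
        chain.append(transmit(chain[-1], a))
    return chain


# Hypothetical two-state illustration (not the data of this paper).
r1 = [0.5, 0.5]
a1 = [[1.0, 0.0],
      [0.5, 0.5]]
a2 = [[0.5, 0.5],
      [0.0, 1.0]]
chain = transmit_chain(r1, [a1, a2])
print(chain)
```

Keeping every intermediate vector, rather than only Rn, makes it possible to inspect the state distribution of each risk element along the chain.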
3 Risk element transmission model based on data mining
Based on the definitions above, two problems must be solved to make the risk transmission theory quantitative:
1) dividing the whole project from beginning to end in order to decide the risk elements in the project;
2) acquiring the risk element transmission matrix between each pair of adjacent risk elements.
The first problem can be solved by experience and analysis of the real circumstances, or by the WBS method in project management[13-14]. The second problem is solved by the data mining method in the rest of this paper.
3.1 Data mining frame of risk element transmission matrix
Data mining is a process that automatically digs out hidden information from data sets; the information takes forms such as relations, concepts, rules and patterns[15]. With the broad application of database technology, databases hold plenty of data that contain much valuable hidden information, so how to dig out this information is a hot topic in data mining research. In other words, data mining is the process of digging hidden, previously unknown and valuable knowledge and principles out of large amounts of historical data. Following the idea of data mining, the risk element transmission matrix can be constructed in the frame shown in Fig.2.
1) Acquire the risk element information that is divided by WBS or other methods, then input it into the data warehouse for further use.
2) Pretreat the information by cleaning noise, filtering and so on.
3) Analyze the information by data mining to get the risk element transmission matrix.
Fig.2 Data mining frame of risk element transmission matrix
3.2 Data mining analysis of risk element transmission matrix
From Definition 3 and Definition 5, in order to get the risk element transmission matrix from the data warehouse, the following two steps are needed.
1) Divide the risk states in the warehouse reasonably.
2) Acquire the transmission probability between each state.
3.2.1 Division of risk states in warehouse
If the state space of Ri is X={Q1, Q2, …, Qn}, a prototype B={B1, B2, …, Bn} is first defined, where Bi is the range of Qi, expressed by an interval number. The lower and upper bounds of the interval can be given by experience, knowledge or decision makers.
Because it is difficult to acquire the exact range of the interval number, following the idea of fuzzy data mining[16-17], the state interval number is enlarged on both sides by the fuzzy method, where the left adaptive index widens the lower bound and the right adaptive index widens the upper bound. The membership function u(x) over the clear interval and the enlarged interval is shown in Fig.3.
Fig.3 Membership function of interval numbers
And the relationship between membership function u(x) and x can be written as follows:
(4)
For each risk element Ri, a membership vector (u(Q1), u(Q2), …, u(Qn))T over its state space can be acquired by Eqn.(4). Based on the maximum membership principle, the state of the risk element can be determined as
u(Ri)=max(u(Q1), u(Q2),…, u(Qn)) (5)
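The discretization step can be sketched as follows. Because Eqn.(4) and Fig.3 are not reproduced here, the sketch assumes a trapezoidal membership function: u(x)=1 inside the clear interval, falling linearly to 0 over the left and right adaptive margins. That shape, the function names `membership` and `classify`, and all numeric bounds are assumptions made for illustration only.

```python
from typing import Dict, Optional, Tuple

# (low, high, d_left, d_right): clear interval and the two adaptive margins.
StateParams = Tuple[float, float, float, float]


def membership(x: float, low: float, high: float,
               d_left: float, d_right: float) -> float:
    """Assumed trapezoidal membership of x in one state interval."""
    if low <= x <= high:
        return 1.0
    if low - d_left < x < low:            # rising edge on the left margin
        return (x - (low - d_left)) / d_left
    if high < x < high + d_right:         # falling edge on the right margin
        return ((high + d_right) - x) / d_right
    return 0.0


def classify(x: float, states: Dict[int, StateParams]):
    """Maximum membership principle of Eqn.(5): pick the state with the
    largest membership degree; return None if x belongs to no state."""
    degrees = {label: membership(x, *params) for label, params in states.items()}
    best: Optional[int] = max(degrees, key=degrees.get)
    if degrees[best] == 0.0:
        best = None
    return best, degrees


# Hypothetical state intervals (not the values of Table 1).
states = {0: (0.0, 10.0, 0.0, 2.0),
          1: (12.0, 20.0, 2.0, 2.0)}
label, degrees = classify(10.5, states)
print(label, degrees)
```

In this illustration the value 10.5 lies in the overlap of the two enlarged intervals, so both states receive a non-zero degree and the larger one decides the state.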
3.2.2 Acquisition of risk element transmission matrix based on data mining
Once the state spaces of the risk elements are divided, the risk element transmission matrix can be acquired by analyzing the historical data in the data warehouse with the following method:
for i=1 to n-1:
For each state of risk element Ri, calculate the relative frequency of each state of risk element Ri+1 and take it as the corresponding element of the transmission matrix Ai.
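The frequency counting above can be sketched in a few lines of Python. The sketch assumes that the historical records have already been discretized into integer state labels and that each record pairs a state of Ri with the state of Ri+1 observed in the same project; the function name `transmission_matrix`, the handling of unobserved states (the row is left as zeros) and the sample records are assumptions, not part of the original method description.

```python
from typing import List, Sequence


def transmission_matrix(prev_states: Sequence[int],
                        next_states: Sequence[int],
                        n_prev: int, n_next: int) -> List[List[float]]:
    """Estimate Ai from paired historical states of Ri and Ri+1.

    Entry [p][q] is the relative frequency with which state q of Ri+1
    was observed when Ri was in state p.
    """
    if len(prev_states) != len(next_states):
        raise ValueError("both state sequences must have the same length")
    counts = [[0] * n_next for _ in range(n_prev)]
    for p, q in zip(prev_states, next_states):
        counts[p][q] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Assumption: a state of Ri never seen in the history yields a zero row.
        matrix.append([c / total for c in row] if total else [0.0] * n_next)
    return matrix


# Hypothetical discretized history with two states per risk element.
history_r1 = [0, 0, 1, 1, 1, 1]
history_r2 = [0, 1, 1, 1, 1, 0]
a1_estimate = transmission_matrix(history_r1, history_r2, n_prev=2, n_next=2)
print(a1_estimate)
```

Repeating the call for each adjacent pair (Ri, Ri+1), i=1, …, n-1, yields the n-1 matrices needed by Eqn.(3).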
4 Example analysis
A working procedure of a project can be divided into 4 risk elements by WBS, (R1, R2, R3, R4), and the interval numbers of the risk element state spaces are listed in Table 1. In Table 1, the risk state is expressed by 0, 1, 2, 3, from low to high, and the clear interval number and maximal interval number of each state are given by experts. Dividing the historical data by Eqn.(5) gives the divisions of risk element states listed in Table 2.
The risk element transmission matrices can be acquired as follows. For example, when R1=3, all of the corresponding values of R2 are equal to 3; in other words, none of 2, 1, 0 appears in R2, so the first row of A1 is 1, 0, 0, 0.
(6)
(7)
(8)
Table 1 State spaces of risk elements
Table 2 Division of risk element states
Therefore, the project risk can be forecasted by using the risk element transmission matrices. If the beginning risk state probability is R1=(0, 0, 0.2, 0.8), which expresses that the probabilities of R1 being in risk states (0, 1, 2, 3) are (0, 0, 0.2, 0.8) respectively, the risk of each risk element can be calculated by Eqn.(3) as follows:
The histogram of frequency is shown in Fig.4, and the histogram of cumulative frequency (the risk elements: R2, R3 and R4) is shown in Fig.5.
With this information, if the controlling risk degree is P(Rs≥2)≤0.1 (Rs is the risk state, P is the probability), then R3 is chosen to control the project risk.
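The check against a controlling risk degree can be sketched as a tail-probability test on each state distribution. The exact selection rule is not spelled out in the text, so the sketch assumes that a risk element is flagged for control when P(Rs≥2) exceeds the limit 0.1; the function names, the element labels X, Y, Z and the distributions are hypothetical and are not the results of this example.

```python
from typing import Dict, List, Sequence


def tail_probability(dist: Sequence[float], threshold_state: int) -> float:
    """P(Rs >= threshold_state) for a distribution indexed by state 0, 1, 2, ..."""
    return sum(p for state, p in enumerate(dist) if state >= threshold_state)


def elements_exceeding(dists: Dict[str, Sequence[float]],
                       threshold_state: int = 2,
                       limit: float = 0.1) -> List[str]:
    """Assumed rule: flag every element whose tail probability exceeds the limit."""
    return [name for name, dist in dists.items()
            if tail_probability(dist, threshold_state) > limit]


# Made-up distributions over states (0, 1, 2, 3), for illustration only.
dists = {"X": [0.5, 0.45, 0.05, 0.0],
         "Y": [0.25, 0.5, 0.25, 0.0],
         "Z": [0.5, 0.5, 0.0, 0.0]}
flagged = elements_exceeding(dists)
print(flagged)
```

The same tail probability can also be read off a cumulative frequency histogram as one minus the cumulative frequency below the threshold state.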
Fig.4 Histogram of each risk element state space: (a) R2; (b) R3; (c) R4
Fig.5 Histogram of cumulative frequency: (a) R2; (b) R3; (c) R4
5 Conclusions
1) The generic project risk element definitions are given, and the fuzzy data mining model is used to realize the quantitative model of the theory in order to estimate the project risk.
2) The risk element transmission matrices can be acquired, and the frequency histogram and cumulative frequency histogram of each risk element are also given. If the controlling risk degree is chosen as P(Rs≥2)≤0.1, which means that the probability of a risk state not less than 2 in the project is not more than 0.1, the risk element R3 is chosen to control the project.
References
[1] WANG He-cheng, TAO Li-yi, ZHANG Yuan-fu. Research and management on project risk[J]. Chinese Journal of Management Science, 1998, 6(4): 15-21. (in Chinese)
[2] PERRY J G. Risk management: An approach for project managers[J]. International Journal of Project Management, 1987, 4(4): 211-216.
[3] KRAUPL S, WIECKERT C. Economic evaluation of the solar carbothermic reduction of ZnO by using a single sensitivity analysis and a Monte-Carlo risk analysis[J]. Energy, 2007, 32(7): 1134-1147.
[4] AU S K, WANG Z H, LO S M. Compartment fire risk analysis by advanced Monte Carlo simulation[J]. Engineering Structures, 2007, 29(9): 2381-2390.
[5] CHAN F T S, KUMAR N. Global supplier development considering risk factors using fuzzy extended AHP-based approach[J]. Omega, International Journal of Management Science, 2007, 35(4): 417-431.
[6] LI C B, WANG J J. The research of hierarchical risk element transmission theory based on fuzzy extended AHP[C]// International Conference on Management. Shanghai: 2007, 129-133.
[7] CHANG D Y. Applications of the extent analysis method on fuzzy AHP[J]. European Journal of Operational Research, 1996, 95(3): 649-655.
[8] HENRY E J. Reliability engineering and risk analysis[M]. LU Ying-zhong. Beijing: Atomic Energy Press, 1988. (in Chinese)
[9] WU Chao. Fault tree analysis of spontaneous combustion of sulphide ores and its risk assessment[J]. Journal of Central South University of Technology, 1995, 2(2): 77-80.
[10] XU Ze-shui, SUN Zai-dong. Priority method for a kind of multi-attribute decision-making problems[J]. Journal of Management Sciences in China, 2002, 5(3): 35-39. (in Chinese)
[11] LI Cun-bin, QU Bin, WANG Ke-cheng. The research on three-dimensional structure model of Generalized project risk element transmission[C]//Proceedings of CRIOCM 2006 International Research Symposium on Advancement of Construction Management and Real Estate. Beijing: Chinese Research Institute of Construction Management, 2006: 285-290.
[12] MA T T, LI C B. The applications of risk analysis in the comprehensive evaluation of construction projects[C]// Proceedings of CRIOCM 2006 International Research Symposium on Advancement of Construction Management and Real Estate. Beijing: Chinese Research Institute of Construction Management, 2006: 448-453.
[13] GLOBERSON S. Impact of Various Work-breakdown Structures on Project Conceptualization[J]. International Journal of Project Management, 1994, 12(3): 165-171.
[14] Project Management Institute Standards Committee. A guide to the project management body of knowledge[M]. New York: Project Management Institute Press, 2000.
[15] CHEN Xiao-hong, LAI Bang-chuan, LUO Ding. Mining association rule efficiently based on data warehouse[J]. Journal of Central South University of Technology, 2003, 10(4): 375-380.
[16] BELACEL N. Multicriteria assignment method PROAFTN: methodology and medical applications[J]. European Journal of Operational Research, 2000, 1(16): 175–183.
[17] BELACEL N, RAVAL H B, PUNNEN A P. Learning multicriteria fuzzy classification method PROAFTN from data[J]. Computers and Operations Research, 2007, 34(7): 1885-1898.
(Edited by YANG Hua)
Foundation item: Project(70572090) supported by the National Natural Science Foundation of China
Received date: 2007-09-15; Accepted date: 2007-10-25
Corresponding author: LI Cun-bin, Professor; Tel: +86-10-51963787; E-mail: lcb999@263.net