IJRIT International Journal of Research in Information Technology, Volume 2, Issue 5, May 2014, Pg: 608-615

International Journal of Research in Information Technology (IJRIT)


ISSN 2001-5569

Prediction of Software Defects Based on Artificial Neural Network Approach

Punam Bajaj

Rajeev Sharma

Raminder Kaur

Assistant Professor Department of CSE Chandigarh Engineering College Mohali, Punjab, India [email protected]

Assistant Professor Department of CSE Chandigarh Engineering College Mohali, Punjab, India [email protected]

Research Scholar Department of CSE Chandigarh Engineering College Mohali, Punjab, India [email protected]

Abstract: Due to the high complexity and the constraints involved in the software development process, it is difficult to develop and produce software without faults. Defective software poses considerable risk by increasing development and maintenance costs and decreasing customer satisfaction. Detecting and fixing bugs at an early stage of the software development life cycle is therefore very important. Identifying faults early helps to predict the need for quality checking and monitoring and the amount of testing required. In this paper, we study the neural network based software defect prediction model. Such fault prediction models are used to identify fault-prone software modules and so help produce reliable software. The Bacterial Foraging Optimization (BFO) algorithm is applied to the learning process to select the best architecture of the neural network.

Keywords: BFOA, Data Mining, Neural Network, Software Defect Prediction, Software Metrics, Swarm Intelligence Optimization

I. INTRODUCTION

A software defect is a flaw in the software that causes it to behave unexpectedly and produce incorrect output. As the use of software increases, failures are increasing rapidly as well. Defective software modules can cause software failures, increase development, maintenance and testing costs, and decrease customer satisfaction. The consequences of failures may even include loss of life or economic loss. For example, a software defect caused the Therac-25 radiation therapy device to behave unexpectedly for certain input sequences, resulting in the delivery of lethal radiation doses and the deaths of some patients in the 1980s. Similarly, the destruction of the prototype Ariane 5 rocket less than a minute after launch was due to a bug in the onboard guidance computer program, resulting in a loss of US $1 billion [15]. Defect prediction is therefore essential in the field of software quality and reliability. It is beneficial to predict system reliability or fault proneness as early as possible to support decisions on testing, code inspection and design rework. An additional benefit of early prediction of software faults is better resource planning and test planning. In the software development life cycle, a fault can be introduced at any stage. Detection of defects becomes a




challenging problem as software grows in size and complexity. Given the current and expected future complexity of software systems, effective quality assurance activities are required to improve software quality. Software fault prediction is one of the quality assurance activities in software quality engineering, alongside formal verification, fault prevention, fault tolerance, inspection and testing. Software metrics may be used in a fault prediction model to improve software quality by predicting fault locations. Metrics play an important role here because they are numerical measures that can be obtained early in the SDLC. Software metrics (object-oriented metrics, process metrics or traditional source code metrics) belonging to previous software versions are used to build the software defect prediction model. Defect prediction models have independent variables in the form of metrics and one dependent variable that indicates the fault proneness of a module. A variety of modeling techniques have been developed and applied for software quality prediction. These include machine learning methods, logistic regression, discriminant analysis, fuzzy classification, Bayesian belief networks, decision trees and neural networks.

II. DATA MINING TECHNIQUES

Data mining is the overall process of extracting knowledge from large amounts of data. It is often described as the process of discovering patterns, correlations, trends or relationships by searching through large amounts of data stored in repositories, corporate databases and data warehouses. Data mining techniques explore, analyze and extract data using complex algorithms in order to discover previously unknown, hidden patterns. Data mining extracts useful knowledge and information and transforms it into an understandable structure for further use.
Data mining includes forecasting what may happen in the future, classifying things into groups by recognizing patterns, clustering things into groups based on their attributes, and associating events that are likely to occur together. Data mining can be applied to the data generated at every stage of the software development life cycle, such as design, development, testing and integration, implementation, deployment and maintenance, to find potential defects in the software. Researchers have applied different data mining techniques such as Bayesian networks, neural networks, decision trees, bagging, fuzzy logic and support vector machines to the prediction of software faults. One possible solution to software fault detection is to employ a neural network, since it can build a model adaptively from a given data set of failure processes. Many researchers have successfully adapted neural networks to software reliability issues.

III. OVERVIEW OF ARTIFICIAL TECHNIQUES USED

A. Artificial Neural Network

An artificial neural network is a computing system made up of a number of simple, highly interconnected processing elements, inspired by the way the biological nervous system in the human brain works. These neurons process information through their dynamic state response to external inputs and work in unison to solve complex problems. A neural network is based on a machine learning approach. Machine learning classification is a popular approach for classifying software modules as defective or non-defective by means of a classification model derived from the software metrics data of previous development projects. Neural network models have a significant advantage over analytical models because they require only the failure history as input, without further assumptions. Using that input, a neural network model automatically develops its own internal model of the failure process and predicts future failures.
Neural networks are now being applied to an increasing number of real-world problems. They are learning mechanisms that can approximate any nonlinear continuous function based on the given data. Their primary feature is that they can be applied to problems that have no algorithmic solution or that are too complex for conventional technologies. Neural networks are well suited to problems that people are good at solving but for which computers generally are not. These problems include pattern recognition and forecasting, which requires the recognition of trends in data. Neural networks have gained immense popularity due to their adaptability to the


problem at hand by training with known data. As an ANN is capable of modeling complex functions, it can be used as a predictive model. The advantages and features of an ANN are:
• It can approximate any complicated nonlinear relation.
• It uses parallel distributed processing, which makes quick computation possible.
• The information is stored in distributed form across the neurons of the network, so it has good robustness and fault tolerance.

A neural network is a collection of fast processing and computing nodes called neurons. It is structured as a large number of nodes and the connections among them. Each neuron receives signals, processes them and finally produces an output signal. Figure 1 depicts a neuron, where f is the activation function that processes the input signals and produces the output of the neuron, A are the outputs of the neurons in the previous layer, and w are the weights of the connections to the neurons of the previous layer.

Figure 1: A Neuron
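The neuron described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper does not name a specific activation function, so the sigmoid used here is an assumption, and the bias term and the function name neuron_output are hypothetical additions that do not appear in the text.

```python
import math

def sigmoid(z):
    """Logistic activation; an assumed choice for f, not taken from the paper."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(A, w, bias=0.0, f=sigmoid):
    """Return f(sum_i w[i] * A[i] + bias).

    A    -- outputs of the neurons in the previous layer
    w    -- weights of the connections to those neurons
    bias -- hypothetical extra term, defaulting to 0 so the formula
            reduces to the weighted sum described in the text
    """
    if len(A) != len(w):
        raise ValueError("A and w must have the same length")
    z = sum(a * wi for a, wi in zip(A, w)) + bias
    return f(z)

# Example: two previous-layer outputs with their connection weights.
print(neuron_output([1.0, 2.0], [0.5, -0.25]))
```

Passing a different callable as f swaps the activation function without changing the weighted-sum step.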

An artificial neural network consists of three layers: an input layer, a hidden layer and an output layer.

1) Input Layer: This layer comprises input units, which represent the raw information provided to the network.
2) Hidden Layer: This layer is represented by hidden units, whose behavior is determined by the input units and the weights connecting the input and hidden units.
3) Output Layer: The behavior of the output units depends on the activity of the hidden units and the weights connecting the hidden and output units.

Connections between the input, hidden and output units are governed by the value assigned to each connection, i.e. its weight. The higher the weight, the more important the corresponding input. Neural networks are suitable for training on large amounts of data with few inputs. In general, an ANN adjusts the values of its weights so that a particular input produces a specific target. It compares the output with the target and repeats the adjustment until the output matches the target. A neural network can be trained to perform a specific function by presenting it with several such input-target pairs.


Figure 2: Multi Layer Neural Networks
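A forward pass through the three layers might be sketched as follows. The sigmoid activation, the bias vectors and the names layer_forward and forward are assumptions made for illustration; the paper does not fix any of them.

```python
import math

def sigmoid(z):
    """Assumed activation function; the paper does not specify one."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, biases):
    """Compute the outputs of one layer.

    weights[j][i] is the weight from input i to unit j of this layer;
    biases[j] is a hypothetical bias for unit j.
    """
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(inputs, W_hidden, b_hidden, W_out, b_out):
    """Input layer -> hidden layer -> output layer."""
    hidden = layer_forward(inputs, W_hidden, b_hidden)
    return layer_forward(hidden, W_out, b_out)

# Example: 2 inputs (e.g. two software metrics), 2 hidden units, 1 output.
output = forward(
    [1.0, 2.0],
    W_hidden=[[0.1, -0.2], [0.4, 0.3]], b_hidden=[0.0, 0.0],
    W_out=[[0.5, -0.5]], b_out=[0.0],
)
print(output)  # a single value in (0, 1), readable as a fault-proneness score
```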

B. Bacterial Foraging Optimization Algorithm (BFOA)

Over the last two decades, swarm intelligence has been the focus of many researchers because of its distinctive behavior, which is inspired by natural systems and in particular by social insects. A swarm intelligence system typically consists of a population of simple agents that interact locally with one another and with their environment under decentralized control. It is a computational intelligence technique for solving complex real-world problems. The inspiration often comes from nature, especially biological systems. The agents follow very simple rules, and although there is no centralized control structure dictating how individual agents should behave, local and to a certain degree random interactions between such agents lead to the emergence of “intelligent” global behavior, unknown to the individual agents. Examples of swarm intelligence in natural systems include colonies of ants and termites, flocks of birds, animal herding, bacterial growth and fish schooling [16]. Some human artifacts also fall into the domain of swarm intelligence, notably some multi-robot systems and certain computer programs written to tackle optimization and data analysis problems [17]. Swarm prediction has been used in the context of forecasting problems. Different techniques have been used to optimize the accuracy and performance of artificial neural network training, such as evolutionary algorithms, genetic algorithms, particle swarm optimization, ant colony optimization and the back propagation algorithm. These algorithms are used to find optimal initial weights, parameters and activation functions and to select the best architecture of the neural network. The Bacterial Foraging Optimization Algorithm (BFOA) is a newcomer to the family of nature-inspired optimization algorithms such as Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO).
It is an artificial intelligence technique that can be used to find approximate solutions to numeric maximization and minimization problems that are extremely difficult or impossible to solve exactly. BFOA is a swarm intelligence technique that models the food-seeking and reproductive strategy of common bacteria such as E. coli in order to solve numeric optimization problems for which there is no effective deterministic approach. The BFO method was introduced by Kevin M. Passino [4], motivated by natural selection, which tends to eliminate animals with poor foraging strategies and favor those with successful ones.

IV. SOFTWARE DEFECT PREDICTION

Software quality is the degree to which software possesses attributes such as functionality, efficiency, portability, performance, serviceability, capability, installability, maintainability, usability and reusability. Software reliability is one of the important factors considered when ensuring software quality. It can be defined as the probability that software executes fault-free in a specified environment for a specified period of time. A software system consists of various modules, and any of these modules can contain a fault that severely affects the reliability of the software.


Software defect prediction is the process of finding defective modules in the software. To produce high-quality software, the final product should have as few defects as possible. Early detection and fixing of software defects can reduce rework effort and development costs and yield more reliable software. The study of defect prediction is therefore important for achieving software quality. A software defect prediction model is a model that tries to predict potential software defects from test data. Defect prediction models are helpful tools for software testing. It is not possible to eliminate all defects, but it is possible to minimize the number of defects and their adverse impact on the software. This study aims at developing an intelligent data mining system based on neural networks for the prediction of software defects from software metrics, with the Bacterial Foraging Optimization (BFO) algorithm used for learning and for selecting the best architecture of the neural network.

V. LITERATURE REVIEW

Considerable research has been carried out on software metrics and defect prediction models. Various techniques have been applied to software defect prediction, such as logistic regression, decision trees, neural networks, Naïve Bayes and many more. This section presents some work related to neural network techniques for software defect prediction. Artificial neural networks have attracted an explosion of interest over the years and are being successfully applied across different areas such as engineering, finance, banking, geology and medicine. Neural networks can be used as a predictive model because they are capable of modeling complex functions. Karunanithi et al. [8] presented neural network models for software reliability prediction and found that neural network models are better at endpoint prediction than analytical models.
Khoshgoftaar et al. [5] introduced the use of neural networks as a tool for predicting the software quality of a very large telecommunications system, classifying modules as fault-prone or not fault-prone. They compared the artificial neural network model with a nonparametric discriminant model and found that the neural model has better predictive accuracy. In [11], the researchers compared a clustering-based approach and a neural network based approach on a real-time data set and found that the neural network approach performs better. Many researchers have used swarm intelligence optimization techniques with neural networks. In [3], the researchers combined the Particle Swarm Optimization algorithm with a back propagation (BP) network. They found that the optimized BP network performed better and overcame the problems of slow convergence and getting trapped in local minima. In [13], the researchers investigated the use of the Artificial Bee Colony algorithm to train a multilayer perceptron neural network and found that the network trained with the Artificial Bee Colony algorithm performs better than one trained with back propagation. In [12], the researchers used a neural network to build a nonparametric model for software reliability prediction, applied the Particle Swarm Optimization algorithm to the learning process, and obtained good predictive capability on different data sets. Some researchers have also proposed nonparametric software reliability prediction systems based on neural network ensembles. In [10] and [2], the researchers combined a number of different neural networks and reported a significant improvement in software reliability forecasting performance over models based on an individual neural network. In [6], the researchers proposed a bell-shaped fuzzy neural network model that uses a bell-shaped curve as the fuzzy membership function in its hidden layer.
From the experiments carried out, they concluded that the proposed fuzzy-based neural network performs better than the existing neural network and other classification algorithms. The proposed model classified software defects better than Random Tree by 14.66%, CART by 11.41%, Bayesian logistic regression by 10.50% and a regular multilayer perceptron with a sigmoidal hidden function by 3.92%.


VI. PROPOSED METHODOLOGY

The fault prediction process consists of two consecutive steps: training and prediction. In the training phase, a prediction model is built from the software metrics and fault data of each module in a previous software version. After this phase, the model is used to predict the fault proneness of modules in the new software version. Because the process consists of training followed by testing, the experiments in this study use a 70%-30% split of the data set for training and testing respectively. The objective of this study is to measure the accuracy of this neural network and compare it with the accuracy of a neural network optimized using a swarm intelligence algorithm.
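The 70%-30% split and the accuracy measure mentioned above could be sketched as follows. The shuffling step, the fixed seed and the function names train_test_split and accuracy are assumptions for illustration; the paper does not say how modules are assigned to the two sets.

```python
import random

def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows and split them into training and testing parts.

    Shuffling with a fixed seed is an assumption; it keeps the split
    reproducible between runs.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

def accuracy(predicted, actual):
    """Fraction of modules whose predicted label equals the actual label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty label lists of equal length")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)

# Example with ten hypothetical module records (here just their indices).
train_rows, test_rows = train_test_split(range(10))
print(len(train_rows), len(test_rows))
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```

The same accuracy function can be applied to the plain network and to the optimized network so that the two are compared on identical test modules.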

Figure 3: Architecture of Neural Network

The learning algorithm of the neural network:
1) Initialize all the weights to small random values.
2) Provide the training data set.
3) Compute the output of the hidden layer and then of the output layer to obtain the actual output.
4) Compute the error between the actual output and the desired output.
5) Adjust the weights of the output layer.
6) Adjust the weights of the hidden layer.
7) Return to step 3 until the error is acceptable, i.e. the actual output is close to the desired output.
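The seven steps above correspond to the usual back propagation training loop. The following is a minimal sketch under several assumptions that the paper does not state: a single hidden layer, sigmoid activations, a squared-error measure, per-sample weight updates, and a learning rate, hidden-layer size and stopping tolerance chosen arbitrarily. The function name train is hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, n_hidden=3, rate=0.5, epochs=2000, tol=1e-3, seed=1):
    """samples: list of (inputs, target) pairs with target in [0, 1].

    Returns (predict, history), where history holds the summed squared
    error of each epoch.
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Step 1: small random weights; the last weight in each row is a bias.
    W_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
           for _ in range(n_hidden)]
    W_o = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(x):
        # Step 3: hidden-layer outputs, then the network output.
        h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in W_h]
        y = sigmoid(sum(w * v for w, v in zip(W_o, h + [1.0])))
        return h, y

    history = []
    for _ in range(epochs):
        total = 0.0
        for x, t in samples:                      # Step 2: training data
            x = list(x)
            h, y = forward(x)
            err = t - y                           # Step 4: output error
            total += 0.5 * err * err
            delta_o = err * y * (1.0 - y)
            # Hidden deltas use the output weights before they are changed.
            delta_h = [delta_o * W_o[j] * h[j] * (1.0 - h[j])
                       for j in range(n_hidden)]
            for j, v in enumerate(h + [1.0]):     # Step 5: output layer
                W_o[j] += rate * delta_o * v
            for j in range(n_hidden):             # Step 6: hidden layer
                for i, v in enumerate(x + [1.0]):
                    W_h[j][i] += rate * delta_h[j] * v
        history.append(total)
        if total < tol:                           # Step 7: stop when close
            break
    return (lambda x: forward(list(x))[1]), history

# Toy data (logical OR) standing in for metric vectors and fault labels.
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
predict, history = train(data)
print(history[0], history[-1])
```

In a defect prediction setting, each input vector would hold the software metrics of one module and the target would be its fault label; the toy data here only demonstrates that the loop reduces the error.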


Figure 4: Basic Flow Chart of BFO Learning Process and selecting the best architecture
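The flow chart itself is not reproduced in the text, so the following is only a simplified sketch of the main BFO phases as commonly described: chemotaxis (tumble and swim), reproduction, and elimination-dispersal. It minimizes an arbitrary cost function. All parameter values and the name bfo_minimize are assumptions, the cell-to-cell swarming term of the original algorithm is omitted, and the link between a bacterium's position and a network architecture is not specified in the paper.

```python
import math
import random

def bfo_minimize(cost, dim, n_bacteria=10, n_chemo=20, n_swim=4, n_repro=4,
                 n_elim=2, p_elim=0.25, step=0.1, bounds=(-1.0, 1.0), seed=0):
    """Simplified bacterial foraging search; returns (best_position, best_cost)."""
    if n_bacteria % 2:
        raise ValueError("n_bacteria must be even for the reproduction step")
    rng = random.Random(seed)
    lo, hi = bounds

    def random_position():
        return [rng.uniform(lo, hi) for _ in range(dim)]

    def random_direction():
        d = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(v * v for v in d)) or 1.0
        return [v / norm for v in d]

    population = [random_position() for _ in range(n_bacteria)]
    best_pos = list(min(population, key=cost))
    best_cost = cost(best_pos)

    for _ in range(n_elim):
        for _ in range(n_repro):
            health = [0.0] * n_bacteria
            for _ in range(n_chemo):
                for i in range(n_bacteria):
                    # Tumble: take one step in a random direction.
                    direction = random_direction()
                    pos = [p + step * d for p, d in zip(population[i], direction)]
                    current = cost(pos)
                    # Swim: keep going the same way while the cost improves.
                    for _ in range(n_swim):
                        trial = [p + step * d for p, d in zip(pos, direction)]
                        trial_cost = cost(trial)
                        if trial_cost >= current:
                            break
                        pos, current = trial, trial_cost
                    population[i] = pos
                    health[i] += current
                    if current < best_cost:
                        best_pos, best_cost = list(pos), current
            # Reproduction: the healthier half (lowest accumulated cost) splits.
            order = sorted(range(n_bacteria), key=lambda k: health[k])
            survivors = [population[k] for k in order[: n_bacteria // 2]]
            population = [list(p) for p in survivors] + [list(p) for p in survivors]
        # Elimination-dispersal: move some bacteria to random positions.
        population = [random_position() if rng.random() < p_elim else p
                      for p in population]
    return best_pos, best_cost

# Toy cost function (sphere); a real run would use the prediction error.
sphere = lambda x: sum(v * v for v in x)
position, value = bfo_minimize(sphere, dim=2)
print(position, value)
```

To use such a search for architecture selection, the cost function would have to train or evaluate a network for each candidate position and return its prediction error; how candidates are encoded is left open here because the source does not describe it.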


VII. CONCLUSION

Any software system can contain a number of defects. As the occurrence of faults is inevitable, we should try to reduce them to a minimum. The main objective of this paper is to study and understand the concept of software fault prediction using artificial neural networks. The paper introduces the neural model and its architecture. In this study, software metrics serve as the independent variables and fault proneness as the dependent variable. The neural network approach used here allows the user to feed the collection of independent variables directly into the neural network as input. In future work, we will train a neural network model for software defect prediction and optimize its accuracy using the Bacterial Foraging Optimization Algorithm to select the best architecture of the neural network.



REFERENCES

[1] Gaurav Aggarwal, Dr. V. K. Gupta, “Neural Network Approach to Measure Reliability of Software Module: A Review,” International Journal of Advances in Engineering Sciences, Vol. 3, Issue 2, April 2013.
[2] Jun Zheng, “Predicting software reliability with neural network ensembles,” Expert Systems with Applications, Vol. 36, Issue 2, Part 1, March 2009, pp. 2116-2122.
[3] Kewen Li, Jisong Kou, Lina Gong, “Predicting software quality by optimized BP network based on PSO,” Journal of Computers, Vol. 6, No. 1, January 2011.
[4] Kevin M. Passino, “Biomimicry of bacterial foraging for distributed optimization and control,” IEEE Control Syst. Mag., Vol. 22, No. 3, pp. 52-67, June 2002.
[5] T. M. Khoshgoftaar, E. B. Allen, J. P. Hudepohl, S. J. Aud, “Application of neural networks to software quality modeling of a very large telecommunications system,” IEEE Transactions on Neural Networks, Vol. 8, No. 4, pp. 902-909, 1997.
[6] M. V. P. Chandra Sekhara Rao, Dr. B. Ravendra Babu, Aparna Chaparala, Dr. A. Damodaram, “An improved Multiperceptron Neural network model to classify Software defects,” International Journal of Computer Science and Information Security, No.3 No.2, February 2011.
[7] Malkit Singh, Dalwinder Singh, “Software Defect Prediction tool based on neural network,” International Journal of Computer Applications (0975-8887), Vol. 70, No. 22, May 2013.
[8] N. Karunanithi, D. Whitley, Y. K. Malaiya, “Using Neural Network In software reliability Prediction,” IEEE Software, Vol. 9, No. 4, pp. 53-59, 1992.
[9] Naheed Azeem, Shazia Usmani, “Analysis of Data mining based Software Defect Prediction Techniques,” Global Journal of Computer Science and Technology, Vol. 11, Issue 16, Version 1.0, September 2011.
[10] P. K. Kapoor, V. S. S. Yadavalli, S. K. Khatri, M. Basirzadeh, “Enhancing software reliability of a complex software system architecture using artificial neural networks ensemble,” International Journal of Reliability, Quality, and Safety Engineering, Vol. 8, Issue 3, 2011, pp. 271-284.
[11] Rachna Ratra, Navneet Singh Randhawa, Parneet Kaur, Dr. Gurdev Singh, “Early Prediction of fault prone modules using clustering based vs. Neural Network Approach in software systems,” International Journal of Electronics & Communication Technology, Vol. 2, Issue 4, December 2011.
[12] Rita G. Al gargoor, Nada N. Saleem, “Software Reliability Prediction using Artificial Techniques,” International Journal of Computer Science Issues, Vol. 10, Issue 4, No. 2, July 2013.
[13] Solmaz Farshidpour, Farshid Keynia, “Using Artificial Bee Colony Algorithm for MLP Training on Software Defect Prediction,” Oriental Journal of Computer Science & Technology, Vol. 5, No. 2, pp. 231-239, December 2012.
[14] Yogesh Singh, Arvinder Kaur, Ruchika Malhotra, “Predicting Testing Effort Using Artificial Neural Network,” Proceedings of the World Congress on Engineering and Computer Science 2008, October 22-24, 2008, San Francisco, USA.
[15] en.wikipedia.org/wiki/software_bug
[16] en.wikipedia.org/wiki/Swarm_intelligence
[17] www.scholarpedia.org/article/Swarm_intelligence
