JE – 832
VII Semester B.E. (CSE/ISE) Degree Examination, June/July 2013 (2K6 Scheme)
CI – 7.2 : DATA MINING AND ALGORITHMS
Time : 3 Hours
Max. Marks : 100
Instruction : Answer any five questions, selecting at least 2 questions from each Part.
PART – A
1. a) Explain the various steps involved in Knowledge Discovery in Databases with a neat block diagram.
8
b) Explain the following : i) Multimedia database ii) Web database iii) Spatial database.
6
c) List out some of the major challenges in data mining.
6
2. a) What is data preprocessing ? Explain its importance.
6
b) Why is data cleaning done ? List out the methods for cleaning the data.
6
c) What are the different data transformation techniques ? Give any three methods for data normalization.
8
3. a) What is a data warehouse ? Explain the 3-tier architecture of a data warehouse.
10
b) Define Association Rule Mining. Explain the various types of Association Rules.
10
4. a) Using the Apriori algorithm, find the large itemsets for the database shown below, taking the minimum support as 2.
TID   Items bought
1     M, O, N, K, E, Y
2     D, O, N, K, E, Y
3     M, A, K, E
4     M, U, C, K, Y
5     C, O, O, K, I, E
10
b) Generate the frequent itemsets for the database shown below using Frequent Pattern tree. Take the minimum support as 2.
TID   Items
1     a, b
2     b, c, d
3     a, c, d, e
4     a, d, e
5     a, b, c
6     a, b, c, d
10
PART – B
5. a) With a neat diagram, explain the process of classification.
6
b) Define a Decision tree and discuss some of the issues in constructing a decision tree.
6
c) For the training set shown below, construct a Decision Tree.
Training set for classifying mammals/non-mammals
Name         Body-Temperature   Gives Birth   Four legged   Hibernates   Class Label
Salamander   Cold-Blooded       no            yes           yes          no
Guppy        Cold-Blooded       yes           no            no           no
Eagle        Warm-Blooded       no            no            no           no
Poorwill     Warm-Blooded       no            no            yes          no
Platypus     Warm-Blooded       no            yes           yes          yes
6. a) With a neat diagram, explain how Neural Networks can be used for classification.
8
b) With an example, explain Bayesian Belief Networks.
8
c) Compare Lazy learners with Eager Learners.
4
7. a) Explain the important features of a Good Clustering Algorithm.
6
b) Discuss the various types of data on which clustering can be done.
6
c) Using the k-means algorithm, cluster the following eight points (with (x, y) coordinates) into three clusters : A1(2, 10), A2(2, 5), A3(8, 4), B1(5, 8), B2(7, 5), B3(6, 4), C1(1, 2), C2(4, 9)
8
8. a) Classify the various clustering algorithms.
8
b) Describe each of the clustering algorithms listed below in terms of the following criteria :
i) Shapes of clusters that can be determined
ii) Input parameters that must be specified
iii) Limitations.
Algorithms : i) k-means ii) k-medoids iii) DBSCAN iv) STING v) BIRCH
12