
2 - Decision Tree Classification

Classification is a supervised learning task: a model is trained on examples with known input and output values and is then used to predict the class of new inputs. Assigning data to predefined groups, whose members need not share all properties, is called classification. Decision tree induction was developed by Ross Quinlan, whose decision tree algorithm is known as ID3 (Iterative Dichotomiser). A decision tree is a classifier in the form of a tree structure.

Decision node: specifies a test on a single attribute.
Leaf node: indicates the value of the target attribute (the class).
Arc/edge: represents one outcome of the split on an attribute.
Path: a conjunction of tests from the root to a leaf that yields the final decision.

ID3, C4.5, and CART are greedy algorithms for the induction of decision trees. Each algorithm uses an attribute selection measure to select the attribute tested at each nonleaf node in the tree. Pruning algorithms attempt to improve accuracy by removing tree branches that reflect noise in the data. Early decision tree algorithms typically assume that the data are memory resident, a limitation for data mining on large databases. Several scalable algorithms, such as SLIQ, SPRINT, and RainForest, have been proposed to address this issue.

Why Decision Trees:


Decision trees are powerful and popular tools for classification and prediction. They represent rules that can be understood by humans and used directly in knowledge systems such as databases.

Key Requirements :
Attribute-value description: the object or case must be expressible in terms of a fixed collection of properties or attributes (e.g., hot, mild, cold).
Predefined classes (target values): the target function has discrete output values (Boolean or multiclass).
Sufficient data: enough training cases must be provided to learn the model.
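To make the attribute-value requirement concrete, here is a minimal sketch of how such training cases might be represented; the attribute names and values are illustrative, not taken from the slides.

```python
# A minimal sketch of training cases in attribute-value form with a
# predefined (discrete) target class; attribute names are illustrative.
training_cases = [
    ({"temperature": "hot",  "humidity": "high"},   "no"),
    ({"temperature": "mild", "humidity": "normal"}, "yes"),
    ({"temperature": "cold", "humidity": "normal"}, "yes"),
    # ... enough cases must be provided to learn a reliable model
]
```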

TYPES OF CLASSIFICATION TECHNIQUES

1. Decision Tree

2. Bayesian classification
3. Rule-based classification

4. Prediction: Accuracy and error measures

FIG: Types of classification techniques (decision tree, Bayesian classification, rule-based classification, and prediction with accuracy and error measures).



Classification by Decision Tree Induction consists of:


1. Decision Tree Induction
2. Attribute Selection Measures
   i. Information gain
   ii. Gain ratio
   iii. Gini index
3. Tree Pruning
4. Scalability and Decision Tree Induction



1. Decision Tree Induction: Decision tree induction is the learning of decision trees from class-labeled training tuples. A decision tree is a flowchart-like tree structure in which each internal node (nonleaf node) denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node (terminal node) holds a class label. The topmost node in a tree is the root node. Decision trees classify instances or examples by starting at the root of the tree and following the branches until a leaf node is reached.

Strengths: decision trees can generate understandable rules, perform classification without much computation, handle both continuous and categorical variables, and give a clear indication of which fields are most important for prediction or classification.

Weaknesses: they are not well suited to predicting continuous attributes, and they perform poorly with many classes and small amounts of data.
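As a quick illustration of these properties, here is a minimal sketch using scikit-learn's DecisionTreeClassifier; the library choice and the integer encoding of the categorical attributes are assumptions, not part of the slides.

```python
# A minimal sketch, assuming scikit-learn is available. This particular
# implementation needs numeric inputs, so the categorical attributes are
# encoded as integers; export_text prints human-readable rules.
from sklearn.tree import DecisionTreeClassifier, export_text

# Each row: [age, student, credit_rating] encoded as integers
# age: 0 = "<=30", 1 = "31..40", 2 = ">40"
# student: 0 = "no", 1 = "yes"; credit_rating: 0 = "fair", 1 = "excellent"
X = [[0, 0, 0], [0, 0, 1], [1, 0, 0], [2, 0, 0],
     [2, 1, 0], [2, 1, 1], [1, 1, 1], [0, 1, 0]]
y = ["no", "no", "yes", "yes", "yes", "no", "yes", "yes"]  # buys_computer

clf = DecisionTreeClassifier(criterion="entropy")  # entropy ~ information gain
clf.fit(X, y)

# Understandable rules and an indication of which fields matter most
print(export_text(clf, feature_names=["age", "student", "credit_rating"]))
print(clf.feature_importances_)      # relative importance of each field
print(clf.predict([[1, 0, 0]]))      # classify a new customer
```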

Steps of Decision Tree Construction


1. Select the best feature as the root node of the whole tree.
2. After partitioning the data by this feature, select the best feature (with respect to the subset of training data) as the root of each sub-tree.
3. Recurse until the partitions become pure or almost pure.

Fig: Class-labeled training tuples from the AllElectronics customer database.



Let A be the splitting attribute. A has v distinct values, {a1, a2, ..., av}, based on the training data.
1. A is discrete-valued: the outcomes of the test at node N correspond directly to the known values of A.
2. A is continuous-valued: the test at node N has two possible outcomes, corresponding to the conditions A <= split_point and A > split_point, respectively.
3. A is discrete-valued and a binary tree must be produced (as dictated by the attribute selection measure or algorithm being used): the test at node N is of the form "A ∈ S_A?", where S_A is the splitting subset for A.
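The three scenarios can be pictured as three ways of generating partitions of the training tuples; the sketch below uses illustrative helper names that are not part of ID3, C4.5, or CART, and assumes each tuple is a dict of attribute values.

```python
# A minimal sketch of the three splitting scenarios.
def multiway_split(tuples, attr):
    """Discrete-valued A: one branch per known value of A."""
    branches = {}
    for t in tuples:
        branches.setdefault(t[attr], []).append(t)
    return branches

def binary_split_continuous(tuples, attr, split_point):
    """Continuous-valued A: branches for A <= split_point and A > split_point."""
    left = [t for t in tuples if t[attr] <= split_point]
    right = [t for t in tuples if t[attr] > split_point]
    return {"<= %s" % split_point: left, "> %s" % split_point: right}

def binary_split_subset(tuples, attr, subset):
    """Discrete-valued A with a binary tree: test 'A in S_A?'."""
    inside = [t for t in tuples if t[attr] in subset]
    outside = [t for t in tuples if t[attr] not in subset]
    return {"in subset": inside, "not in subset": outside}
```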

Figure content: the tree tests "age?" at the root. For age <= 30 it tests "student?" (no -> buys_computer = no, yes -> buys_computer = yes); for age 31..40 the leaf is buys_computer = yes; for age > 40 it tests "credit_rating?" (excellent -> buys_computer = no, fair -> buys_computer = yes).
Fig: Decision tree for the concept buys_computer, indicating whether a customer at AllElectronics is likely to purchase a computer. Each internal (non-leaf) node represents a test on an attribute. Each leaf node represents a class (either buys_computer = yes or buys_computer = no).
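A minimal sketch that encodes the tree from the figure as nested dictionaries and classifies one customer by walking from the root to a leaf; the representation is illustrative, not a prescribed data structure.

```python
# The buys_computer tree from the figure as nested dicts: keys are the
# tested attributes, branch labels map to either a subtree or a class label.
tree = {
    "age": {
        "<=30": {"student": {"no": "no", "yes": "yes"}},
        "31..40": "yes",
        ">40": {"credit_rating": {"excellent": "no", "fair": "yes"}},
    }
}

def classify(node, example):
    """Walk from the root, following the branch matching the example's
    attribute value, until a leaf (a plain class label) is reached."""
    while isinstance(node, dict):
        attribute = next(iter(node))      # the attribute tested at this node
        node = node[attribute][example[attribute]]
    return node

customer = {"age": "<=30", "student": "yes", "credit_rating": "fair"}
print(classify(tree, customer))  # -> "yes"
```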

Algorithm: Generate_decision_tree. Generate a decision tree from the training tuples of data partition D.
Input:
i. Data partition, D, which is a set of training tuples and their associated class labels;
ii. Attribute_list, the set of candidate attributes;
iii. Attribute_selection_method, a procedure to determine the splitting criterion that best partitions the data tuples into individual classes. This criterion consists of a splitting_attribute and, possibly, either a split point or a splitting subset.
Output: A decision tree.

Algorithm
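A sketch of the basic Generate_decision_tree procedure implied by the input/output specification above; this is a reconstruction, and helper names such as Node, Leaf, and majority_class are illustrative.

```python
# A reconstruction sketch of the basic tuple-partitioning procedure.
from collections import Counter

class Leaf:
    def __init__(self, label):
        self.label = label

class Node:
    def __init__(self, attribute):
        self.attribute = attribute
        self.children = {}

def majority_class(D):
    """Most frequent class label among (features, label) training tuples."""
    return Counter(label for _, label in D).most_common(1)[0][0]

def generate_decision_tree(D, attribute_list, attribute_selection_method):
    # (1) All tuples in D have the same class: return a leaf with that class.
    labels = {label for _, label in D}
    if len(labels) == 1:
        return Leaf(labels.pop())
    # (2) No candidate attributes remain: return a majority-class leaf.
    if not attribute_list:
        return Leaf(majority_class(D))
    # (3) Pick the splitting criterion that best separates the classes;
    #     here it returns the chosen attribute and the resulting partitions.
    splitting_attribute, partitions = attribute_selection_method(D, attribute_list)
    node = Node(splitting_attribute)
    remaining = [a for a in attribute_list if a != splitting_attribute]
    # (4) Grow one subtree per outcome of the test on splitting_attribute.
    for outcome, D_j in partitions.items():
        if not D_j:                      # empty partition: majority-class leaf
            node.children[outcome] = Leaf(majority_class(D))
        else:
            node.children[outcome] = generate_decision_tree(
                D_j, remaining, attribute_selection_method)
    return node
```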


2. Attribute Selection Measures: An attribute selection measure is a heuristic for selecting the splitting criterion that best separates a given data partition, D, of class-labeled training tuples into individual classes.
i. Information gain: ID3 uses information gain as its attribute selection measure and selects the attribute with the highest information gain. This measure is based on pioneering work by Claude Shannon on information theory, which studied the value or information content of messages. Let node N represent or hold the tuples of partition D.
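The measure can be stated as Info(D) = -Σᵢ pᵢ log₂ pᵢ and Gain(A) = Info(D) - Σⱼ (|Dⱼ|/|D|) Info(Dⱼ). Below is a minimal sketch, assuming D is a list of (attribute-dict, class-label) pairs; the function names are illustrative.

```python
# A minimal sketch of entropy and information gain.
from collections import Counter
from math import log2

def info(D):
    """Expected information (entropy): Info(D) = -sum_i p_i * log2(p_i)."""
    counts = Counter(label for _, label in D)
    total = len(D)
    return -sum((c / total) * log2(c / total) for c in counts.values())

def info_gain(D, attribute):
    """Gain(A) = Info(D) - sum_j (|D_j|/|D|) * Info(D_j), where the D_j
    are the partitions of D induced by the values of attribute A."""
    partitions = {}
    for features, label in D:
        partitions.setdefault(features[attribute], []).append((features, label))
    info_A = sum(len(D_j) / len(D) * info(D_j) for D_j in partitions.values())
    return info(D) - info_A
```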


ii. Gain ratio: The information gain measure is biased toward tests with many outcomes. That is, it prefers to select attributes having a large number of values. C4.5 uses gain ratio, which attempts to overcome this bias by normalizing information gain with the split information of the split.
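A minimal sketch of gain ratio, reusing the info_gain function from the sketch above; SplitInfo_A(D) = -Σⱼ (|Dⱼ|/|D|) log₂(|Dⱼ|/|D|).

```python
# A minimal sketch of gain ratio (info_gain is defined in the previous sketch).
from collections import Counter
from math import log2

def gain_ratio(D, attribute):
    partitions = Counter(features[attribute] for features, _ in D)
    total = len(D)
    split_info = -sum((n / total) * log2(n / total) for n in partitions.values())
    if split_info == 0:                  # single outcome: gain ratio undefined
        return 0.0
    return info_gain(D, attribute) / split_info
```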

iii. Gini index: The Gini index is used in CART. Using the notation described above, the Gini index measures the impurity of D, a data partition or set of training tuples, as

Gini(D) = 1 − Σ_{i=1}^{m} p_i²

where p_i is the probability that a tuple in D belongs to class C_i and is estimated by |C_i,D| / |D|, and m is the number of classes.
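A minimal sketch of the Gini index under the same (attribute-dict, class-label) representation used in the earlier sketches.

```python
# A minimal sketch: Gini(D) = 1 - sum_i p_i^2.
from collections import Counter

def gini(D):
    counts = Counter(label for _, label in D)
    total = len(D)
    return 1.0 - sum((c / total) ** 2 for c in counts.values())
```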

3. Tree Pruning: Pruning algorithms attempt to improve accuracy by removing tree branches that reflect noise in the data. There are two common approaches to tree pruning: prepruning and postpruning.
i. Prepruning: a tree is pruned by halting its construction early (e.g., by deciding not to further split or partition the subset of training tuples at a given node).
ii. Postpruning: subtrees are removed from a fully grown tree. A subtree at a given node is pruned by removing its branches and replacing it with a leaf. The leaf is labeled with the most frequent class in the subtree being replaced.
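A minimal sketch of postpruning against a held-out validation set, reusing the Node, Leaf, and majority_class helpers from the algorithm sketch above; the validation-set criterion is one common choice, not the only one.

```python
# A minimal sketch of bottom-up postpruning (reduced-error style).
def predict(node, features):
    # Assumes every attribute value seen here also has a branch in the tree.
    while not isinstance(node, Leaf):
        node = node.children[features[node.attribute]]
    return node.label

def errors(node, examples):
    """Count misclassifications of (features, label) pairs by the (sub)tree."""
    return sum(1 for features, label in examples
               if predict(node, features) != label)

def postprune(node, validation):
    """Replace a subtree with a majority-class leaf if that does not hurt
    accuracy on the validation tuples that reach the node."""
    if isinstance(node, Leaf) or not validation:
        return node
    for outcome in list(node.children):
        subset = [ex for ex in validation
                  if ex[0][node.attribute] == outcome]
        node.children[outcome] = postprune(node.children[outcome], subset)
    candidate = Leaf(majority_class(validation))
    if errors(candidate, validation) <= errors(node, validation):
        return candidate
    return node
```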

Fig: An unpruned decision tree

Fig: A pruned version of the same tree.


4. Scalability and Decision Tree Induction: The efficiency of existing decision tree algorithms, such as ID3, C4.5, and CART, has been well established for relatively small data sets. Efficiency becomes a concern when these algorithms are applied to the mining of very large real-world databases, because of the restriction that the training tuples reside in memory. The discussion covers (a) repetition, (b) replication, (c) SLIQ, and (d) SPRINT.

a) Repetition

Fig: An example of subtree (a) Repetition (where an attribute is repeatedly tested along a given branch of the tree, e.g., age)

b) Replication

Fig: An example of subtree (b) Replication (where duplicate subtrees exist within a tree, such as the subtree headed by the node credit rating?).

c) SLIQ

Fig: Data Set

Fig: Attribute list and class list data structures used in SLIQ for the tuple data.
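A minimal sketch of the SLIQ-style data structures shown in the figure: one presorted attribute list per attribute and a single memory-resident class list; the field names are illustrative.

```python
# A minimal sketch, assuming `tuples` is a list of dicts with the class
# label stored under "class" (field names are illustrative).
def build_sliq_structures(tuples, attributes):
    # Class list: one entry per record id, holding its class label
    # (in SLIQ it also points to the tree node the tuple currently falls in).
    class_list = {rid: row["class"] for rid, row in enumerate(tuples)}

    # One attribute list per attribute: (value, record id) pairs, presorted
    # by value so numeric split points can be evaluated in a single pass.
    attribute_lists = {
        attr: sorted((row[attr], rid) for rid, row in enumerate(tuples))
        for attr in attributes
    }
    return attribute_lists, class_list
```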

d) SPRINT

Fig: Attribute list data structure used in SPRINT for the tuple data


Issues of Classification:
1. Accuracy
2. Training time
3. Robustness
4. Interpretability
5. Scalability

Typical applications:
1. Credit approval
2. Target marketing
3. Medical diagnosis
4. Fraud detection
5. Weather forecasting
6. Stock marketing