L22 DecisionTrees
Arun Kumar
IIT Ropar
Outline
History
Shannon’s Information
Measure of Impurity
Entropy
Entropy for a set S is given by
$$H(S) = -\sum_{c \in C} p(c) \log_2 p(c),$$
where C is the set of class labels and p(c) is the proportion of elements of S belonging to class c.
Gini
Gini impurity for a set S, where the target variable takes N different labels:
$$\mathrm{Gini}(S) = \sum_{i \neq j} p(i)\,p(j) = 1 - \sum_{i=1}^{N} p(i)^2,$$
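The two impurity measures above can be computed directly from class counts. A minimal sketch (function names are my own, not from the slides):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy: H(S) = -sum over classes c of p(c) * log2(p(c))."""
    n = len(labels)
    return -sum((k / n) * log2(k / n) for k in Counter(labels).values())

def gini(labels):
    """Gini impurity: Gini(S) = 1 - sum over classes i of p(i)^2."""
    n = len(labels)
    return 1 - sum((k / n) ** 2 for k in Counter(labels).values())

labels = ["yes", "yes", "no", "no"]
print(entropy(labels))  # 1.0 for an even two-class split
print(gini(labels))     # 0.5 for an even two-class split
```

Both measures are maximal for a 50/50 split and zero for a pure set, which is why either can serve as the splitting criterion in a decision tree.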
Decision Tree Introduction
Sample Decision Tree
Algorithms to build decision trees
ID3 Algorithm based on Weather Data
¹ Based on “Machine Learning” by T. Mitchell, Ch. 3.
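ID3 grows the tree by choosing, at each node, the attribute with the highest information gain, Gain(S, A) = H(S) − Σ_v (|S_v|/|S|) H(S_v). A sketch of that selection step, using a few illustrative rows in the spirit of the weather data (not the full 14-row table from Mitchell):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """H(S) = -sum over classes c of p(c) * log2(p(c))."""
    n = len(labels)
    return -sum((k / n) * log2(k / n) for k in Counter(labels).values())

def information_gain(rows, attr, target="Play"):
    """Gain(S, A) = H(S) - sum over values v of |S_v|/|S| * H(S_v)."""
    labels = [r[target] for r in rows]
    n = len(rows)
    remainder = 0.0
    for v in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

# Illustrative rows only; attribute names follow the weather-data style.
rows = [
    {"Outlook": "Sunny",    "Windy": "False", "Play": "No"},
    {"Outlook": "Sunny",    "Windy": "True",  "Play": "No"},
    {"Outlook": "Overcast", "Windy": "False", "Play": "Yes"},
    {"Outlook": "Rainy",    "Windy": "False", "Play": "Yes"},
    {"Outlook": "Rainy",    "Windy": "True",  "Play": "No"},
]

# ID3's choice for the root: the attribute maximizing information gain.
best = max(["Outlook", "Windy"], key=lambda a: information_gain(rows, a))
print(best)  # Outlook
```

ID3 then recurses on each subset of rows sharing a value of the chosen attribute, stopping when a subset is pure or no attributes remain.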
Final Decision Tree
Recursive Binary Tree Splitting Algorithm
where $\hat{y}_{R_1}$ is the mean response for the training observations in $R_1(j, s)$, and $\hat{y}_{R_2}$ is the mean response for the training observations in $R_2(j, s)$.

² Based on “An Introduction to Statistical Learning with Applications in R”, Chapter 8, Page 306.
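For regression trees, each greedy step searches over every predictor $j$ and cutpoint $s$ for the half-planes $R_1(j,s)$ and $R_2(j,s)$ that minimize the total RSS around the region means $\hat{y}_{R_1}$ and $\hat{y}_{R_2}$. A minimal sketch of that exhaustive search (names are my own):

```python
def best_split(X, y):
    """Search predictor j and cutpoint s minimizing
    sum_{i: x_ij < s}(y_i - ybar_R1)^2 + sum_{i: x_ij >= s}(y_i - ybar_R2)^2."""
    best = None  # (rss, j, s)
    for j in range(len(X[0])):
        for s in sorted({row[j] for row in X}):
            left  = [y[i] for i, row in enumerate(X) if row[j] < s]
            right = [y[i] for i, row in enumerate(X) if row[j] >= s]
            if not left or not right:
                continue  # skip splits that leave a region empty
            m1, m2 = sum(left) / len(left), sum(right) / len(right)
            rss = sum((v - m1) ** 2 for v in left) + \
                  sum((v - m2) ** 2 for v in right)
            if best is None or rss < best[0]:
                best = (rss, j, s)
    return best

X = [[1.0], [2.0], [3.0], [10.0], [11.0]]
y = [1.0, 1.2, 0.9, 5.0, 5.2]
rss, j, s = best_split(X, y)
print(j, s)  # the split lands in the gap: j=0, s=10.0
```

Recursive binary splitting then repeats this search inside each resulting region until a stopping criterion (e.g. a minimum number of observations per region) is met.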
Data
Final Decision Tree Based On Python Sklearn
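In the same spirit, a tree can be fit with scikit-learn's `DecisionTreeClassifier` and inspected as text. A minimal sketch, assuming scikit-learn is installed; the integer encoding of the weather-style features below is illustrative, not the slides' actual dataset:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy rows: [outlook (0=sunny, 1=overcast, 2=rainy), windy (0=no, 1=yes)]
X = [[0, 0], [0, 1], [1, 0], [2, 0], [2, 1]]
y = ["No", "No", "Yes", "Yes", "No"]

# criterion="entropy" matches the information-based impurity measure.
clf = DecisionTreeClassifier(criterion="entropy", random_state=0)
clf.fit(X, y)

# Print the learned splits as indented if/else rules.
print(export_text(clf, feature_names=["outlook", "windy"]))
print(clf.predict([[1, 1]]))
```

`export_text` renders the fitted tree's thresholds and leaf classes, which is a convenient way to compare sklearn's result against a hand-built ID3 tree.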
References
• https://www.superdatascience.com/