Decision Tree
•Root Node: The root node is where the decision tree starts. It represents
the entire dataset, which is then divided into two or more homogeneous
sets.
•Leaf Node: Leaf nodes are the final output nodes; the tree cannot be
split further once a leaf node is reached.
•Splitting: Splitting is the process of dividing the decision node/root node
into sub-nodes according to the given conditions.
•Branch/Sub-Tree: A subtree formed by splitting the tree.
•Pruning: Pruning is the process of removing unwanted branches from
the tree.
•Parent/Child Node: A node that splits into sub-nodes is called the parent
node, and its sub-nodes are called the child nodes.
How Does the Decision Tree
Algorithm Work?
• Step-1: Begin the tree with the root node, say S, which contains
the complete dataset.
• Step-2: Find the best attribute in the dataset using an Attribute
Selection Measure (ASM).
• Step-3: Divide S into subsets that contain the possible values
of the best attribute.
• Step-4: Generate the decision tree node that contains the best
attribute.
• Step-5: Recursively build new decision trees from the subsets
created in Step-3. Continue this process until the nodes cannot be
classified further; such a final node is called a leaf node.
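The five steps above can be sketched as a short recursive function. This is a minimal illustration, not a production implementation: it assumes a dataset of dicts with categorical attributes, and it uses information gain (entropy reduction) as the ASM, one common choice among several.

```python
# Minimal sketch of the recursive tree-building steps, using information
# gain (entropy) as the Attribute Selection Measure (ASM).
# Dataset format and helper names are illustrative.
from collections import Counter
import math

def entropy(labels):
    # Shannon entropy of a list of class labels.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, attr):
    # Step-2: weighted entropy reduction after splitting on `attr`.
    gain = entropy(labels)
    n = len(labels)
    for value in set(r[attr] for r in rows):
        sub = [lab for r, lab in zip(rows, labels) if r[attr] == value]
        gain -= len(sub) / n * entropy(sub)
    return gain

def build_tree(rows, labels, attrs):
    # Step-5 stopping condition: pure node (or no attributes left) -> leaf.
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]
    # Steps 2-4: pick the best attribute by the ASM and make it a node.
    best = max(attrs, key=lambda a: info_gain(rows, labels, a))
    node = {best: {}}
    # Step-3 and Step-5: split on each value and recurse on the subset.
    for value in set(r[best] for r in rows):
        idx = [i for i, r in enumerate(rows) if r[best] == value]
        node[best][value] = build_tree([rows[i] for i in idx],
                                       [labels[i] for i in idx],
                                       [a for a in attrs if a != best])
    return node
```

On a tiny dataset such as `build_tree([{"salary": "high"}, {"salary": "low"}], ["accept", "decline"], ["salary"])`, the recursion bottoms out immediately and returns a root with two leaves.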
Example
Suppose a candidate has a job offer and wants to decide whether to accept
it. To solve this problem, the decision tree starts with the root node (the
Salary attribute, chosen by the ASM). The root node splits into the next
decision node (distance from the office) and one leaf node, based on the
corresponding labels. That decision node splits further into one decision
node (cab facility) and one leaf node. Finally, the last decision node
splits into two leaf nodes (Accepted offer and Declined offer).
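The job-offer tree described above can be written out directly as nested dicts and traversed to classify a candidate. The attribute names (Salary, distance from the office, cab facility) come from the example; the yes/no encoding and which leaf each "no" branch reaches are illustrative assumptions, since the slides do not state them.

```python
# The example decision tree as nested dicts. Attribute names follow the
# slide text; the yes/no encoding and leaf placement are assumptions.
offer_tree = {
    "salary_attractive": {
        "no": "Declined offer",                     # leaf under the root
        "yes": {
            "distance_near_office": {
                "no": "Declined offer",             # leaf under distance node
                "yes": {
                    "cab_facility": {
                        "yes": "Accepted offer",    # final two leaves
                        "no": "Declined offer",
                    }
                }
            }
        }
    }
}

def classify(tree, candidate):
    # Walk from the root to a leaf, following the candidate's answers.
    while isinstance(tree, dict):
        attr = next(iter(tree))
        tree = tree[attr][candidate[attr]]
    return tree

print(classify(offer_tree, {"salary_attractive": "yes",
                            "distance_near_office": "yes",
                            "cab_facility": "yes"}))  # Accepted offer
```

Each internal dict plays the role of a decision node and each string is a leaf node, mirroring the terminology defined at the start of this section.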