This post introduces decision trees for classification problems. A decision tree is a representation for classifying instances, and decision tree learning is one of the most successful techniques for supervised classification learning.
- Introduction to decision tree for classification problems
- Measuring decision tree performance
Introduction to decision tree for classification problems
The simplest decision tree has only one test condition and two possible outcomes – one internal node and two branches.
A decision tree comprises a root node, child nodes, arcs or edges (labeled with attribute values), and leaves (class labels). At the start, the whole training set is considered the root.
A decision tree or a classification tree is a tree in which each internal (non-leaf) node is labeled with an input feature. The arcs coming from a node labeled with a feature are labeled with each of the possible values of the feature. Each leaf of the tree is labeled with a class or a probability distribution over the classes. (Poole & Mackworth, 2010)
The goal of a decision tree algorithm is to create a decision-making model that predicts the value of a target variable (class) by learning simple decision rules inferred from prior data (training data), and “for which the decision tree uses the tree representation to solve the problem in which the leaf node corresponds to a class label and attributes are represented on the internal node of the tree” (Sharma, 2021).
Decision trees classify instances by sorting them down the tree from the root to some leaf node, which provides the classification of the instance. An instance is classified by starting at the root node of the tree, testing the attribute specified by this node, then moving down the tree branch corresponding to the value of the attribute … This process is then repeated for the subtree rooted at the new node. (geeksforgeeks.org)
To predict a class label for a record we start from the root of the tree. “We compare the values of the root attribute with the record’s attribute. On the basis of comparison, we follow the branch corresponding to that value and jump to the next node” (Chauhan, 2021).
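The root-to-leaf classification procedure described above can be sketched in a few lines of Python. The nested-dict tree structure and the attribute names (`income`, `age`) below are invented for illustration; they do not come from any of the cited sources.

```python
# Minimal sketch: classify a record by walking a decision tree from the
# root to a leaf. Internal nodes are dicts; leaves are class labels.
def classify(node, record):
    """Follow the branch matching the record's attribute value until a leaf."""
    while isinstance(node, dict):
        attribute = node["attribute"]      # test condition at this internal node
        value = record[attribute]          # the record's value for that attribute
        node = node["branches"][value]     # follow the corresponding arc
    return node                            # a leaf: the predicted class label

# Toy tree: test 'income' first; for low income, also test 'age'.
tree = {
    "attribute": "income",
    "branches": {
        "high": "respond",
        "low": {
            "attribute": "age",
            "branches": {"young": "respond", "old": "ignore"},
        },
    },
}

print(classify(tree, {"income": "high", "age": "old"}))  # respond
print(classify(tree, {"income": "low", "age": "old"}))   # ignore
```

Each iteration of the loop repeats the test-and-descend step for the subtree rooted at the new node, exactly as the quoted description states.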
The builder of a decision tree classifier needs to make two key decisions: 1) which attributes to use for test conditions, and 2) in what order. “Answering these two questions differently forms different decision tree algorithms. Different decision trees can have different prediction accuracy on the test dataset” (Li, 2021).
Decision tree algorithms can be applied in marketing decision making to classify (e.g., the propensity to act in a certain way, such as to respond to a promotional ad) or to score a target variable (high or low). The categorical attributes of a dataset used for customer segmentation and targeting or used to derive a customer credit score can include age, gender, education level, income, and number of children.
Attributes in a dataset can be categorical (preferred for decision trees), continuous (split into ranges for testing, e.g., >60 or ≤60), or binary (yes/no).
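As a hedged sketch of the marketing use case above, the snippet below trains scikit-learn's `DecisionTreeClassifier` on a tiny, made-up customer dataset. The feature values and response labels are invented for illustration, and the categorical attributes are integer-encoded because scikit-learn trees expect numeric inputs.

```python
# Assumed toy data: whether a customer responds to a promotional ad.
from sklearn.tree import DecisionTreeClassifier

# Features: [age_group (0=young, 1=old), income (0=low, 1=high)]
X = [[0, 1], [1, 1], [0, 0], [1, 0], [0, 1], [1, 0]]
# Target: 1 = responded to the promotional ad, 0 = did not
y = [1, 1, 1, 0, 1, 0]

clf = DecisionTreeClassifier(criterion="gini", max_depth=2, random_state=0)
clf.fit(X, y)

# Predicted class for a young, high-income customer
print(clf.predict([[0, 1]]))
```

Setting `criterion="gini"` selects the Gini-based split purity measure discussed in the next section; `criterion="entropy"` is the alternative.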
Introduction to Decision Trees (Chu & Wesslén, 2015A):
https://www.youtube.com/watch?v=jzoDKtnTPpg
Measuring decision tree performance
The classification performance of a decision tree model can be described using a confusion matrix, or summarized with accuracy metrics (overall accuracy, sensitivity, specificity). As a rule of thumb, 90% accuracy is often considered good, although the appropriate threshold depends on the application.
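The three metrics can be read directly off a confusion matrix. The counts below (TP, TN, FP, FN) are hypothetical, chosen only to illustrate the arithmetic:

```python
# Assumed confusion-matrix counts for 100 test records (invented values).
tp, tn, fp, fn = 45, 45, 5, 5

accuracy = (tp + tn) / (tp + tn + fp + fn)  # fraction classified correctly
sensitivity = tp / (tp + fn)                # true-positive rate (recall)
specificity = tn / (tn + fp)                # true-negative rate

print(accuracy, sensitivity, specificity)   # 0.9 0.9 0.9
```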
Split purity can be measured using the Gini index or entropy. The Gini impurity of a node equals 1 − (p₁² + p₂² + … + pₙ²), where pᵢ is the proportion of records belonging to class i. The lower the Gini impurity, the purer the split: a value of 0 means all records at the node belong to one class. Entropy is a measure of impurity, disorder, or uncertainty (randomness in a dataset); it controls how a decision tree decides to split the data, and the lower the entropy after a split, the better the split. For a two-class node, the maximum Gini impurity is 0.5, whereas the maximum entropy is 1.
Because the Gini impurity involves no logarithmic function, it takes less computational time to calculate than entropy.
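Both purity measures are short formulas; a minimal sketch, taking the class-probability distribution at a node as input:

```python
import math

def gini(probabilities):
    """Gini impurity: 1 - sum(p_i^2). 0 = pure; 0.5 = max for two classes."""
    return 1 - sum(p ** 2 for p in probabilities)

def entropy(probabilities):
    """Entropy in bits: -sum(p_i * log2(p_i)). 0 = pure; 1 = max for two classes."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

print(gini([0.5, 0.5]))     # 0.5  (maximally impure two-class node)
print(entropy([0.5, 0.5]))  # 1.0
print(gini([1.0, 0.0]))     # 0.0  (pure node)
```

The `log2` call in `entropy` is the logarithm the preceding paragraph refers to; `gini` needs only squaring and summing.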
In a decision tree, each internal node (non-leaf node) denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node (or terminal node) holds a class label … Once a decision tree has been constructed, it can be used to classify a test dataset, which is also called deduction. (Li, 2021)
Multistage Decision Trees (Chu & Wesslén, 2015B):
https://www.youtube.com/watch?v=yJf96EPBNz4
Decision tree portfolio project: use the Titanic problem to demonstrate how to build decision trees for prediction.
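A hedged outline of how such a Titanic exercise might start with scikit-learn. The six rows below are invented stand-ins for the real Kaggle data; in practice you would load `train.csv` with pandas and encode `sex` the same way.

```python
from sklearn.tree import DecisionTreeClassifier

# Assumed toy sample. Features: [pclass, sex (0=male, 1=female), age]
X = [[3, 0, 22], [1, 1, 38], [3, 1, 26], [1, 0, 54], [3, 0, 35], [2, 1, 27]]
y = [0, 1, 1, 0, 0, 1]  # 1 = survived, 0 = did not

model = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
model.fit(X, y)
print(model.score(X, y))  # training accuracy on this toy sample
```

On real data you would hold out a test set and evaluate with a confusion matrix, as described in the previous section, rather than scoring on the training data.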
Key references
Chauhan, Nagesh Singh. (2021). Decision tree algorithm, explained. KDnuggets. https://www.kdnuggets.com/2020/01/decision-tree-algorithm-explained.html
Chu, Charlene, & Wesslén, Maria. [UTM MCS Math Videos]. (2015A, Sep 10). Introduction to Decision Trees [Video]. YouTube. https://www.youtube.com/watch?v=jzoDKtnTPpg
Chu, Charlene, & Wesslén, Maria. [UTM MCS Math Videos]. (2015B, Sep 10). Multistage Decision Trees [Video]. YouTube. https://www.youtube.com/watch?v=yJf96EPBNz4
Li, Gangmin. (2021, March 7). Chapter 8: Prediction with decision trees. Do A Data Science Project in 10 Days. https://bookdown.org/gmli64/do_a_data_science_project_in_10_days/prediction-with-decision-trees.html
Poole, David, & Mackworth, Alan. (2010). Foundations of computational agents: Learning decision trees. Artificial Intelligence. https://artint.info/html/ArtInt_177.html
Sharma, Akshay (akshaysharma06). (2021, February 25). Machine learning 101: Decision tree algorithm for classification. Analytics Vidhya. https://www.analyticsvidhya.com/blog/2021/02/machine-learning-101-decision-tree-algorithm-for-classification