Decision tree for classification problems

This post introduces decision trees for classification problems. A decision tree is a representation for classifying instances, and decision tree learning is one of the most successful techniques for supervised classification learning.

  • Introduction to decision tree for classification problems
  • Measuring decision tree performance

You may also be interested in What is data mining?

Introduction to decision tree for classification problems

The simplest decision tree has only one test condition and two possible outcomes – one internal node and two branches.

A decision tree is composed of a root node, child nodes, arcs or edges (attribute values), and leaves (classes). At the start, the whole training set is considered the root.

A decision tree or a classification tree is a tree in which each internal (non-leaf) node is labeled with an input feature. The arcs coming from a node labeled with a feature are labeled with each of the possible values of the feature. Each leaf of the tree is labeled with a class or a probability distribution over the classes. (Poole & Mackworth, 2010)

The goal of a decision tree algorithm is to create a decision-making model that predicts the value of a target variable (class) by learning simple decision rules inferred from prior data (training data), and “for which the decision tree uses the tree representation to solve the problem in which the leaf node corresponds to a class label and attributes are represented on the internal node of the tree” (Sharma, 2021).

Decision trees classify instances by sorting them down the tree from the root to some leaf node, which provides the classification of the instance. An instance is classified by starting at the root node of the tree, testing the attribute specified by this node, then moving down the tree branch corresponding to the value of the attribute … This process is then repeated for the subtree rooted at the new node. (geeksforgeeks.org)

To predict a class label for a record we start from the root of the tree. “We compare the values of the root attribute with the record’s attribute. On the basis of comparison, we follow the branch corresponding to that value and jump to the next node” (Chauhan, 2021).
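The traversal described above can be sketched in a few lines. This is a minimal illustration, not any particular library's API; the attribute names ("outlook", "humidity") and the tiny play-tennis tree are made-up examples.

```python
class Node:
    """Internal node: tests one attribute; children keyed by attribute value."""
    def __init__(self, attribute, children):
        self.attribute = attribute
        self.children = children  # dict: attribute value -> Node or Leaf

class Leaf:
    """Leaf node: holds a class label."""
    def __init__(self, label):
        self.label = label

def classify(node, record):
    """Sort a record down the tree from the root to a leaf."""
    while isinstance(node, Node):
        value = record[node.attribute]  # test the attribute at this node
        node = node.children[value]     # follow the branch matching its value
    return node.label                   # the leaf gives the class

# Example tree: should we play tennis?
tree = Node("outlook", {
    "sunny": Node("humidity", {"high": Leaf("no"), "normal": Leaf("yes")}),
    "overcast": Leaf("yes"),
    "rain": Leaf("yes"),
})

print(classify(tree, {"outlook": "sunny", "humidity": "high"}))  # no
```

Each comparison at an internal node selects exactly one branch, so classifying a record costs at most one attribute test per level of the tree.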

The builder of a decision tree classifier needs to make two key decisions: 1) which attributes to use for test conditions, and 2) in what order. “Answering these two questions differently forms different decision tree algorithms. Different decision trees can have different prediction accuracy on the test dataset” (Li, 2021).
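A common way to answer the first question is to score each candidate attribute by the weighted impurity of the children it would produce, and pick the attribute with the lowest score. The sketch below uses Gini impurity for that score; the toy marketing-style data and attribute names are invented for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_i^2)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def weighted_gini_after_split(rows, labels, attribute):
    """Average child impurity, weighted by child size, after splitting on attribute."""
    n = len(rows)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    return sum(len(g) / n * gini(g) for g in groups.values())

def best_attribute(rows, labels, attributes):
    """Pick the attribute whose split yields the lowest weighted impurity."""
    return min(attributes, key=lambda a: weighted_gini_after_split(rows, labels, a))

# Toy data: 'income' separates the classes perfectly, 'gender' does not.
rows = [
    {"gender": "m", "income": "high"},
    {"gender": "f", "income": "high"},
    {"gender": "m", "income": "low"},
    {"gender": "f", "income": "low"},
]
labels = ["respond", "respond", "ignore", "ignore"]
print(best_attribute(rows, labels, ["gender", "income"]))  # income
```

Applying this selection recursively to each child node answers the second question (the order of tests): the attribute chosen at each level is simply the best one for the subset of records that reached that node.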

Decision tree algorithms can be applied in marketing decision making, either to classify (e.g., predict the propensity to act in a certain way, such as responding to a promotional ad) or to score a target variable (high or low). Categorical attributes used for customer segmentation and targeting, or for deriving a customer credit score, can include age, gender, education level, income, and number of children.

Attributes in a dataset can be categorical (preferred for decision trees), continuous (e.g., age discretized into ranges such as >60 or ≤60), or binary (yes/no).

Introduction to Decision Trees (Chu & Wesslén, 2015A):

https://www.youtube.com/watch?v=jzoDKtnTPpg

Measuring decision tree performance

The classification performance of a decision tree model can be described using a confusion matrix, or measured with accuracy metrics such as sensitivity, specificity, and overall correctness; as a rule of thumb, 90% accuracy is often considered good.
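These metrics follow directly from the four cells of a binary confusion matrix. A minimal sketch, with made-up counts for illustration:

```python
def confusion_metrics(tp, fn, fp, tn):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp/fn/fp/tn = true positives, false negatives, false positives, true negatives.
    """
    accuracy = (tp + tn) / (tp + tn + fp + fn)  # overall correctness
    sensitivity = tp / (tp + fn)                # true positive rate (recall)
    specificity = tn / (tn + fp)                # true negative rate
    return accuracy, sensitivity, specificity

acc, sens, spec = confusion_metrics(tp=45, fn=5, fp=5, tn=45)
print(acc, sens, spec)  # 0.9 0.9 0.9
```

Reporting sensitivity and specificity alongside accuracy matters because a model can score high accuracy on an imbalanced dataset while performing poorly on the minority class.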

Split purity can be measured using the Gini index or entropy. Gini impurity equals 1 − (P1² + P2² + … + Pn²), where Pi is the proportion of class i in the node; the lower the Gini impurity, the purer the split. Entropy is a measure of disorder or uncertainty (randomness) in a dataset and controls how a decision tree decides to split the data; again, the lower the entropy, the purer the split. For a two-class problem, the maximum value of Gini impurity is 0.5, whereas the maximum value of entropy is 1.

Because Gini impurity requires no logarithm to compute, it takes less computational time than entropy.
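Both measures can be computed directly from the class proportions in a node. A short sketch verifying the extreme values quoted above for the two-class case:

```python
import math

def gini_impurity(probs):
    """Gini impurity: 1 - sum(p_i^2). 0 = pure; 0.5 is the max for two classes."""
    return 1.0 - sum(p * p for p in probs)

def entropy(probs):
    """Shannon entropy in bits: -sum(p_i * log2(p_i)). 0 = pure; 1 is the max for two classes."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A 50/50 split is maximally impure for both measures.
print(gini_impurity([0.5, 0.5]))  # 0.5
print(entropy([0.5, 0.5]))        # 1.0
```

A pure node (all records in one class, e.g. probabilities `[1.0, 0.0]`) scores 0 under both measures, which is why tree growth stops splitting once a node is pure.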

In a decision tree, each internal node (non-leaf node) denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node (or terminal node) holds a class label … Once a decision tree has been constructed, it can be used to classify a test dataset, which is also called deduction. (Li, 2021)

Multistage Decision Trees (Chu & Wesslén, 2015B):

https://www.youtube.com/watch?v=yJf96EPBNz4

Decision tree portfolio project: use the Titanic problem to demonstrate how to build decision trees for prediction.

Key references

Chauhan, Nagesh Singh. (2021). Decision tree algorithm, explained. KDnuggets. https://www.kdnuggets.com/2020/01/decision-tree-algorithm-explained.html

Chu, Charlene, & Wesslén, Maria. [UTM MCS Math Videos]. (2015A, Sep 10). Introduction to Decision Trees [Video]. YouTube. https://www.youtube.com/watch?v=jzoDKtnTPpg

Chu, Charlene, & Wesslén, Maria. [UTM MCS Math Videos]. (2015B, Sep 10). Multistage Decision Trees [Video]. YouTube. https://www.youtube.com/watch?v=yJf96EPBNz4

Li, Gangmin. (2021, March 7). Chapter 8 Prediction with decision trees. Do A Data Science Project in 10 Days. https://bookdown.org/gmli64/do_a_data_science_project_in_10_days/prediction-with-decision-trees.html

Poole, David, & Mackworth, Alan. (2010). Foundations of computational agents: Learning decision trees. Artificial Intelligence. https://artint.info/html/ArtInt_177.html

Sharma, Akshay (akshaysharma06). (2021, February 25). Machine learning 101: Decision tree algorithm for classification. Analytics Vidhya. https://www.analyticsvidhya.com/blog/2021/02/machine-learning-101-decision-tree-algorithm-for-classification

