Data collection in statistics

Data collection in statistics is the process of gathering information from relevant sources to find a solution to a research problem. Data analysts gather data from primary and secondary sources to help them answer a data-related business question (e.g., what was the percentage change in sales over the past six months?). This post constitutes Lesson 1 of the Basic Statistics Mini-Course.

You may also be interested in Data presentation in statistics.

Key concepts covered in this post: data sources, random samples, table of random digits, simulations, tally charts, data collection methods, types of data.

Data sources and random samples

Data sources include surveys, questionnaires, interviews, observations, experiments, simulations, books, newspapers, and databases (see Data collection methods).

A population is the whole group of people or elements being studied. Often it is not practical or possible to study the whole population. A sample is the part of the population from which data will actually be collected. A random sample is a sample in which each member of the population has an equal chance of being selected. Random, representative samples of the population are used to avoid biasing the results and so improve the accuracy of predictions.

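Methods of selecting random samples include drawing names from a hat, selecting every third name in a phone book, and using a table of random digits. Selecting every third name is strictly a systematic sample; it behaves like a random sample only when the starting point is chosen at random and the list has no pattern related to what is being studied. Scientific calculators have a Random or Rand function to generate random numbers.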
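Drawing names from a hat can also be sketched in software. The example below is a minimal illustration, assuming a made-up population of 20 customers; the names `population` and `draw_random_sample` are hypothetical and not part of the lesson. It uses Python's standard `random` module to draw a sample without replacement, so that every member has the same chance of being selected.

```python
import random

# Hypothetical population of 20 customers (illustrative data only).
population = [f"Customer {i}" for i in range(1, 21)]


def draw_random_sample(members, k, seed=None):
    """Draw k distinct members so that each has an equal chance of selection."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(members, k)


sample = draw_random_sample(population, 5, seed=42)
print(sample)
```

Passing a seed is optional; it is used here only so that the same sample can be reproduced when checking the work.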

When collecting real data is impractical, a simulation can be performed instead: an experiment is conducted whose outcomes have the same probabilities as the event being investigated. Two common techniques are tossing a coin, when the probability of the event is 1/2, and spinning a spinner divided into equal sections, so that the chance of landing on a section matches the probability of the event being studied.
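The coin and spinner techniques can be imitated with a random number generator. The sketch below is an illustration only; the function names `simulate_coin_tosses` and `simulate_spinner` are assumptions made for this example, and the spinner is assumed to have equal-sized sections.

```python
import random


def simulate_coin_tosses(n_tosses, seed=None):
    """Return the proportion of heads in n_tosses fair coin tosses."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses


def simulate_spinner(n_sections, n_spins, seed=None):
    """Return how many times each of n_sections equal sections comes up."""
    rng = random.Random(seed)
    counts = [0] * n_sections
    for _ in range(n_spins):
        counts[rng.randrange(n_sections)] += 1
    return counts


# Event with probability 1/2, modelled by a coin.
print(simulate_coin_tosses(1000, seed=1))

# Event with probability 1/4, modelled by one section of a four-section spinner.
print(simulate_spinner(4, 1000, seed=1))
```

With a large number of trials, the proportion of heads should settle near 1/2 and each spinner section should come up in roughly a quarter of the spins.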

Data can be recorded in a tally chart to keep track of the totals. Example: A restaurant owner wants to know which dishes are the most popular among her customers. She records the results of a survey in a tally chart.

Dish | Tally (frequency) | Ratio | Percentage
Pizza | 9 | 9/27 | 33.3%
Chicken fingers | 6 | 6/27 | 22.2%
Soup and sandwich | 12 | 12/27 | 44.4%
Total | 27 | 27/27 | 100%
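The restaurant tally can be reproduced with a short script. This is a minimal sketch, assuming the survey answers are available as a plain list of dish names; the variable and function names are hypothetical. It uses `collections.Counter` from the Python standard library to count each answer and then derives the ratio and percentage columns.

```python
from collections import Counter

# Survey responses rebuilt from the frequencies in the tally chart.
responses = (
    ["Pizza"] * 9
    + ["Chicken fingers"] * 6
    + ["Soup and sandwich"] * 12
)


def summarize(answers):
    """Return (dish, count, ratio, percentage) rows for a list of answers."""
    tally = Counter(answers)
    total = sum(tally.values())
    return [
        (dish, count, f"{count}/{total}", f"{count / total:.1%}")
        for dish, count in tally.items()
    ]


rows = summarize(responses)
for dish, count, ratio, percentage in rows:
    print(f"{dish:<20}{count:>4}  {ratio:>6}  {percentage:>6}")
```

The percentage is simply the count divided by the total number of responses, for example 9/27 = 33.3% for pizza.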

Data collection methods

Qualitative research data collection methods tend to use purposeful (non-random) sampling and are usually case study based. Common qualitative data collection methods include in-depth interviews, document reviews, and observational research methods.

Quantitative research data collection methods tend to rely on random sampling. They include surveys/questionnaires, polls, and experiments.

Types of data

There are two types of data: qualitative and quantitative. These are further classified into four types: nominal, ordinal, discrete, and continuous.

[Figure: Types of data (courtesy of GeeksforGeeks.org)]

Qualitative or categorical data are data that cannot be measured or counted in the form of numbers.

Qualitative data are further classified into two types: nominal and ordinal. Nominal data label variables without any order or quantitative value. Examples of nominal data are gender, hair colour, and ethnicity. Ordinal data have a natural order: each value has a relative position on a scale, although the distances between positions are not necessarily equal. Examples of ordinal data are rankings (first, second, and third), letter grades (A, B, and C), and education level (primary, secondary, and higher).

Quantitative data can be expressed in numerical values. Quantitative data can be represented by a wide variety of graphs and charts, such as bar graphs, histograms, and scatter plots.

Quantitative data are further classified into two types: discrete and continuous. Discrete data are counts that take whole-number values. Examples of discrete data are the number of students in a class, the number of players participating in a game, and the number of days in a week. Continuous data can take any value within a range, including fractional values, and can be measured to ever finer precision. Examples of continuous data are the height of children, the speed of cars, and share prices.
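One practical consequence of these four types is that each supports different summaries. The sketch below is an illustration with made-up values (all data and variable names are assumptions, not from the lesson): nominal data are summarized by the most frequent category, ordinal data by a middle value found after sorting by rank, and discrete and continuous data by a mean.

```python
from statistics import mean, mode

# Made-up example values, one list per data type.
hair_colour = ["brown", "black", "brown", "blonde"]  # nominal: labels only
grades = ["B", "A", "C", "B", "A"]                   # ordinal: ordered labels
class_sizes = [24, 30, 27]                           # discrete: whole-number counts
heights_cm = [121.5, 118.2, 125.0]                   # continuous: any value in a range

# Nominal data have no order, so the most common label is the natural summary.
most_common_colour = mode(hair_colour)

# Ordinal data can be sorted once the order of the labels is stated explicitly.
grade_rank = {"C": 0, "B": 1, "A": 2}
sorted_grades = sorted(grades, key=grade_rank.get)
median_grade = sorted_grades[len(sorted_grades) // 2]  # middle of an odd-length list

# Quantitative data support arithmetic such as the mean.
mean_class_size = mean(class_sizes)
mean_height_cm = mean(heights_cm)

print(most_common_colour, median_grade, mean_class_size, round(mean_height_cm, 1))
```

Averaging letter grades or hair colours directly would not be meaningful, which is why identifying the data type comes before choosing a summary.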

Key references

GeeksforGeeks. (2021, Oct 29). Explain different types of data in statistics. https://www.geeksforgeeks.org/explain-different-types-of-data-in-statistics/

Great Learning Team. (2021, Sep 27). 4 Types Of Data – Nominal, Ordinal, Discrete and Continuous. https://www.mygreatlearning.com/blog/types-of-data/

Sinclair, Margaret. (1999). How to Get an A in: Statistics & Data Analysis. Coles Publishing, Toronto.

Next post: Part 2: Data presentation in statistics

Key concepts: Frequency distributions and histograms, graphical representations of data, comparing graphs, stem and leaf plots.

