Calculating the Mode of a Dataset

The mode is a measure of central tendency that identifies the most frequently occurring value(s) in a dataset. Unlike the mean and median, finding the mode requires no arithmetic on the values themselves; it depends only on how often each value appears. In this section, we will define the mode, discuss its characteristics, and work through examples of how to determine and interpret it.

Mode

What is the Mode?

The mode of a dataset is the value(s) that occur locally with the highest frequency.

What Does "Occur Locally" Mean?

When discussing the shape of histograms, we encountered unimodal, multimodal, and uniform distributions. The mode corresponds to the peaks in these distributions.

  • A unimodal distribution has one peak.
    The red curve drawn over the histogram looks like a bell.
  • A multimodal distribution has multiple peaks with similar frequencies.
    A bimodal distribution with its two peaks labeled.
  • A uniform distribution has no distinct peaks, meaning no mode exists.
    A uniform distribution has bars that are all approximately the same height.

Locating these peaks precisely requires advanced mathematical techniques, such as calculus, even though many multimodal distributions are intuitive and easy to see in histograms, stem-and-leaf plots, and dot plots. (Human height, for example, is often described as bimodal: there is one mode for men's height and another for women's height because, on average, men tend to be about five inches taller than women.) To avoid these local considerations, we will refine our definition of the mode.

Definition: Mode

The mode of a dataset is the value(s) that occur most frequently.

What Changed in This Definition?

By removing the term "occur locally," we ensure that all identified modes have the same frequency. This revised definition makes it easier to identify all the modes at a glance from frequency distributions.
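The revised definition can be sketched in a few lines of Python: tally how often each value occurs, then keep every value whose count equals the highest count. The function name `find_modes` is hypothetical, and the choice to report "no mode" when every distinct value is tied is an assumption made for this illustration, not something the definition above dictates.

```python
from collections import Counter

def find_modes(data):
    """Return every value that attains the highest frequency in data.

    Assumption (for illustration only): an empty list is returned when the
    data is empty or when all distinct values occur equally often, which is
    treated here as "no mode".
    """
    counts = Counter(data)
    if not counts:
        return []
    highest = max(counts.values())
    modes = [value for value, count in counts.items() if count == highest]
    # If two or more distinct values exist and all of them are tied,
    # no value stands out, so report no mode.
    if len(counts) > 1 and len(modes) == len(counts):
        return []
    return modes

print(find_modes([1, 2, 2, 3, 3, 4]))  # two values tie at the highest frequency
print(find_modes([1, 2, 3, 4]))        # every value is tied, so no mode is reported
```

Because every returned value shares the same highest count, all identified modes have the same frequency, as the revised definition requires.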

Example

Find the mode of the 2016-2017 tuition and fees (in thousands of dollars) for the top 14 universities in the U.S. by hand.

2016-2017 Tuition and Fees (in $1000s)
Tuition and Fees
45 47 52 49 55 48 48
51 51 50 51 48 51 51

Solution

By observation, the most frequently occurring value is 51, which appears five times.

A copy of the tuition fees table with the five instances of 51 enclosed each in a black box.

Thus, the mode is 51.

$$\tag*{\(\blacksquare\)}$$
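As a cross-check on the hand count, the same tally can be done programmatically. This is a minimal sketch using Python's standard `collections.Counter`; the variable names are chosen here for illustration.

```python
from collections import Counter

# Tuition and fees (in $1000s), copied from the table above.
tuition = [45, 47, 52, 49, 55, 48, 48,
           51, 51, 50, 51, 48, 51, 51]

tally = Counter(tuition)

# most_common(1) returns a one-element list holding the (value, count)
# pair with the highest count.
mode_value, mode_count = tally.most_common(1)[0]
print(f"mode = {mode_value}, frequency = {mode_count}")
```

The tally agrees with the count by observation: 51 occurs five times, more often than any other value.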

Example

The LSAT scores for a sample of 50 students are recorded below. Use the Summary Statistics Calculator to determine whether a mode exists and, if it does, identify the mode(s). Also, classify the dataset as unimodal, multimodal, or having no mode.

Sample of 50 LSAT Scores
LSAT Scores
174 172 169 176 169 170 175 171 168 177
165 180 173 166 178 170 174 167 179 172
163 181 171 164 177 169 175 168 180 170

Solution

Copy the data and enter it into the Summary Statistics Calculator.
A screenshot of the Summary Statistics Calculator showing that the average value is 171.68.

Click on the Mode Checkbox.

The mode checkbox is selected, showing that 169 and 170 are the two modes and indicating that this is a bimodal set.

The tool reports that 169 and 170 are the modes of the dataset and indicates that the dataset is bimodal, which is a type of multimodal distribution.

$$\tag*{\(\blacksquare\)}$$
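For readers without access to the calculator, a similar check can be sketched with `statistics.multimode` from the Python standard library (Python 3.8 or later). The list below contains only the scores as they are listed in the table above, and the `classify` helper, including its "no mode" rule, is a hypothetical convention introduced for this illustration.

```python
from statistics import multimode

# Scores exactly as listed in the table above.
lsat = [174, 172, 169, 176, 169, 170, 175, 171, 168, 177,
        165, 180, 173, 166, 178, 170, 174, 167, 179, 172,
        163, 181, 171, 164, 177, 169, 175, 168, 180, 170]

# multimode returns all values tied for the highest frequency,
# in the order they first appear in the data.
modes = multimode(lsat)

def classify(modes, data):
    """Hypothetical helper: label a dataset by its number of modes."""
    if len(modes) == 1:
        return "unimodal"
    if len(modes) == len(set(data)):
        # Assumption: if every distinct value is tied, report no mode.
        return "no mode"
    return "multimodal"

classification = classify(modes, lsat)
print(modes, classification)
```

For the listed scores, 169 and 170 are tied at the highest frequency, so the sketch reports two modes and labels the data multimodal, in agreement with the calculator.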

The mode is particularly useful for summarizing certain types of qualitative data. Unfortunately, the Summary Statistics Calculator does not handle qualitative data. Therefore, we will use the Frequency Distribution Tool to make a frequency distribution of the data and then find the data value(s) with the highest frequency.

Example

The following dataset represents the sizes of shirts sold over the last 30 days at a clothing retailer. Use the Frequency Distribution Tool to determine the mode of the dataset.

Shirt Sizes Sold Over the Last 30 Days
Shirt Sizes
Small Medium Large X-Large X-Large Medium Large X-Large Small X-Large
Medium X-Large X-Large Large Small X-Large Medium X-Large Large X-Large
X-Large X-Large Large Small Medium X-Large X-Large X-Large Large Medium
X-Large Large Medium X-Large X-Large Small Medium Large X-Large X-Large
Large X-Large Medium Small X-Large X-Large X-Large Large Small Medium

Solution

Load the data into the Frequency Distribution Tool. Nothing should appear at first, because the tool organizes data into classes by default, and classes can only be made from quantitative data.

The data has been loaded into the GeoGebra tool. Nothing is displayed yet.

To display a frequency distribution of the raw data, deselect the 'Organize Data Into Classes' checkbox. The distribution will then appear on the right.

The organize data into classes checkbox is deselected. The mode is X-Large with a frequency of 23.

From the distribution, the most frequently sold size was X-Large, with a count of 23.
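The same frequency distribution can be built outside the tool. The sketch below uses Python's `collections.Counter`, which works on category labels just as well as on numbers; the variable names are illustrative choices, and the rows are copied from the table above.

```python
from collections import Counter

# Shirt sizes, one string per table row, copied from the table above.
rows = [
    "Small Medium Large X-Large X-Large Medium Large X-Large Small X-Large",
    "Medium X-Large X-Large Large Small X-Large Medium X-Large Large X-Large",
    "X-Large X-Large Large Small Medium X-Large X-Large X-Large Large Medium",
    "X-Large Large Medium X-Large X-Large Small Medium Large X-Large X-Large",
    "Large X-Large Medium Small X-Large X-Large X-Large Large Small Medium",
]

# Split each row on whitespace to recover the individual observations.
sizes = [size for row in rows for size in row.split()]

# Frequency distribution of the raw (qualitative) data.
frequency = Counter(sizes)
for size, count in frequency.most_common():
    print(f"{size:<8} {count}")

# The mode is the category with the highest frequency.
top_size, top_count = frequency.most_common(1)[0]
print(f"mode = {top_size} (frequency {top_count})")
```

No classes are formed here; each distinct label is counted as-is, which mirrors deselecting the 'Organize Data Into Classes' checkbox.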

Conclusion

The mode identifies the most common values in a dataset, making it useful for analyzing both quantitative and qualitative data. Unlike the mean and median, the mode may not exist, and a dataset can have more than one mode. By understanding how to determine the mode and classify distributions as unimodal, multimodal, or having no mode, we gain deeper insight into data patterns.