Quartiles for Ungrouped Data

What Are Quartiles?

The median is like a ruler that divides data into two equal parts, right in the middle (50%). Well, there's another friend of the median, called quartiles.

If the median divides data into two, quartiles are even better; they divide the sorted data into four equal parts! Imagine you have a chocolate bar, and you break it into four equal pieces. Quartiles are the breaking points.

There are three quartile breaking points:

  1. Lower Quartile ($Q_1$): This is the first break. It separates the smallest 25% of the data from the rest. Like the first quarter of the chocolate.
  2. Middle Quartile ($Q_2$): This is the median! It's exactly in the middle, dividing the data in half (50% left, 50% right). Like the break in the middle of the chocolate.
  3. Upper Quartile ($Q_3$): This is the last break. It separates the smallest 75% of the data from the largest 25%. Like the boundary after three-quarters of the chocolate.

So, $Q_1$, $Q_2$, and $Q_3$ divide our sorted data into four groups, each holding about 25% of the data points.

How to Find the Position of Quartiles

Okay, now how do we know the position (rank) of $Q_1$, $Q_2$, and $Q_3$ in our ordered data?

Assume we have $n$ data points that we have sorted from smallest to largest.

Q1 (Lower Quartile)

The formula is simple:

$$\text{Position of } Q_1 = \text{Data point at } \frac{1}{4}(n+1)$$

  • If the result is a whole number, for example 5, then $Q_1$ is the value of the 5th data point.
  • If the result has a decimal, for example 5.25, then $Q_1$ lies between the 5th and 6th data points. (There's a way to calculate its value later, but for now, we're just finding the position.)

Simple Example:

Suppose we have 20 data points ($n = 20$).

Position of $Q_1$ = Data point at $\frac{1}{4}(20+1)$ = Data point at $\frac{21}{4}$ = Data point at 5.25.

This means $Q_1$ is between the 5th and 6th data points.
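
As a rough sketch of the rule above, the position of Q1 can be computed exactly with fractions and then checked for whether it is a whole number. The function names `q1_position` and `describe_position` are made up for this illustration; they do not come from any library.

```python
from fractions import Fraction
import math


def q1_position(n):
    """Return the 1-based position of Q1 for n sorted data points: (n + 1) / 4."""
    return Fraction(n + 1, 4)


def describe_position(position):
    """Say which data point(s) a quartile position refers to."""
    if position.denominator == 1:
        # Whole number: the quartile is exactly that data point.
        return f"data point {position.numerator}"
    lower = math.floor(position)
    # Fractional: the quartile lies between two neighbouring data points.
    return f"between data points {lower} and {lower + 1}"


pos = q1_position(20)
print(float(pos))              # 5.25
print(describe_position(pos))  # between data points 5 and 6
```

Using `Fraction` instead of floating-point division keeps the whole-number check exact, which is why the sketch tests the denominator rather than comparing floats.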

Q2 (Median or Middle Quartile)

This is the median, so the formula is:

$$\text{Position of } Q_2 = \text{Data point at } \frac{1}{2}(n+1)$$

The rules are the same as for $Q_1$:

  • If the result is a whole number, say 10, $Q_2$ is the value of the 10th data point.
  • If the result has a decimal, say 10.5, $Q_2$ is between the 10th and 11th data points.

Simple Example ($n = 20$):

Position of $Q_2$ = Data point at $\frac{1}{2}(20+1)$ = Data point at $\frac{21}{2}$ = Data point at 10.5.

This means $Q_2$ (the median) is between the 10th and 11th data points.

Q3 (Upper Quartile)

The formula is similar again:

$$\text{Position of } Q_3 = \text{Data point at } \frac{3}{4}(n+1)$$

The rules are exactly the same:

  • If the result is a whole number, say 15, $Q_3$ is the value of the 15th data point.
  • If the result has a decimal, say 15.75, $Q_3$ is between the 15th and 16th data points.

Simple Example ($n = 20$):

Position of $Q_3$ = Data point at $\frac{3}{4}(20+1)$ = Data point at $\frac{63}{4}$ = Data point at 15.75.

This means $Q_3$ is between the 15th and 16th data points.
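
The three formulas share one pattern: the position of the k-th quartile is k/4 times (n + 1), for k = 1, 2, 3. A minimal sketch of that pattern follows; the helper name `quartile_position` is an assumption made for this example, not a standard function.

```python
from fractions import Fraction


def quartile_position(k, n):
    """1-based position of Q_k (k = 1, 2, 3) among n sorted data points."""
    if k not in (1, 2, 3):
        raise ValueError("k must be 1, 2, or 3")
    # k/4 * (n + 1), kept exact so whole-number positions stay whole.
    return Fraction(k, 4) * (n + 1)


for k in (1, 2, 3):
    print(f"Q{k}: position {float(quartile_position(k, 20))}")
# Q1: position 5.25
# Q2: position 10.5
# Q3: position 15.75
```

The printed positions match the three worked examples for 20 data points.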

Exercise

Try to find the position of $Q_1$, $Q_2$, and $Q_3$ from the math test scores of these 7 children:

Scores: 7, 5, 8, 6, 9, 7, 10

Step 1: Sort the data first!

Sorted data: 5, 6, 7, 7, 8, 9, 10

Number of data points ($n$) = 7

Step 2: Find the quartile positions using the formulas

  • Position of $Q_1$:

    $$\text{Data point at } \frac{1}{4}(n+1) = \text{Data point at } \frac{1}{4}(7+1) = \text{Data point at } \frac{8}{4} = \text{Data point at } 2$$

    The result is a whole number (2), so $Q_1$ is the 2nd data point.

  • Position of $Q_2$ (Median):

    $$\text{Data point at } \frac{1}{2}(n+1) = \text{Data point at } \frac{1}{2}(7+1) = \text{Data point at } \frac{8}{2} = \text{Data point at } 4$$

    The result is a whole number (4), so $Q_2$ is the 4th data point.

  • Position of $Q_3$:

    $$\text{Data point at } \frac{3}{4}(n+1) = \text{Data point at } \frac{3}{4}(7+1) = \text{Data point at } \frac{24}{4} = \text{Data point at } 6$$

    The result is a whole number (6), so $Q_3$ is the 6th data point.

Step 3: Determine the quartile values

Look at the sorted data: 5, 6, 7, 7, 8, 9, 10

  • $Q_1$ = 2nd data point = 6
  • $Q_2$ = 4th data point = 7
  • $Q_3$ = 6th data point = 9

The Fourth Quartile (Q4)

You might be wondering, "If there's $Q_1$, $Q_2$, and $Q_3$, is there a $Q_4$?"

Technically, the concept of quartiles divides the data into four parts. $Q_1$ is the boundary for the first 25%, $Q_2$ (the median) is the 50% boundary, and $Q_3$ is the 75% boundary. The final boundary, which encompasses 100% of the data, is actually the maximum value of the dataset.

So, while we could refer to the maximum value as "$Q_4$", in statistical analysis, we don't typically use the term $Q_4$ explicitly. The main focus is on $Q_1$, $Q_2$, and $Q_3$ because they provide important information about the spread and center of the data in the lower, middle, and upper sections. The minimum value is sometimes called "$Q_0$", but like $Q_4$, it's less commonly used than $Q_1$, $Q_2$, and $Q_3$.
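
For illustration only, here are the two outer boundaries for the exercise scores used earlier. The variable names `q0` and `q4` are labels chosen for this sketch, following the informal naming described above.

```python
data = sorted([7, 5, 8, 6, 9, 7, 10])

q0 = data[0]    # minimum, sometimes labelled Q0
q4 = data[-1]   # maximum, sometimes labelled Q4

print(q0, q4)   # 5 10
```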