Percentiles for Grouped Data

What Are Percentiles?

You're already familiar with quartiles, which divide data into 4 equal parts, right? Well, percentiles are like quartiles' sibling, but they're even more detailed!

If quartiles divide data into 4 chunks, percentiles divide ordered data into 100 equal chunks. That's a lot, huh? Like dividing a chocolate bar into 100 tiny squares.

Each chunk is separated by a percentile value. There are 99 percentile values, starting from $P_1$, $P_2$, $P_3$, ..., up to $P_{99}$.

  • $P_{10}$ (10th Percentile) means this value separates the smallest 10% of the data from the remaining 90%.
  • $P_{50}$ (50th Percentile) is exactly the same as the Median or the Second Quartile ($Q_2$), because it divides the data right in the middle (50% below, 50% above).
  • $P_{85}$ (85th Percentile) means this value separates the smallest 85% of the data from the largest 15%.
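
Because quartiles and percentiles cut the same ordered data, each quartile lines up with one particular percentile. The correspondence below is the standard one; it is written out here only as a quick reference and goes slightly beyond what the bullets state:

$$Q_1 = P_{25}, \qquad Q_2 = P_{50} = \text{Median}, \qquad Q_3 = P_{75}$$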

Percentiles are very useful for seeing the position of a specific value relative to the entire dataset, like class rankings for test scores or a child's growth compared to peers of the same age.

How to Find Percentile Values for Grouped Data

Just like finding quartiles for grouped data, we also use interpolation to find the value of a percentile ($P_i$) when the data is grouped.

The steps are very similar:

Find the Percentile Class Position

First, we determine which data point corresponds to the i-th percentile. The formula is:

$$\text{Position of } P_i = \text{the } \left( \frac{i}{100} \times n \right)\text{-th data point}$$

  • $i$ = Which percentile are we looking for? (e.g., 10, 50, 85)
  • $n$ = Total number of data points

Once we have the position, we look at the cumulative frequency column ($F_k$) to find out which class interval this percentile falls into: it is the first class whose $F_k$ is at least that position.
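
The lookup described above can be sketched in Python. This is only an illustration under stated assumptions: the function name `find_percentile_class` is hypothetical, the frequencies are sample values for four classes, and "falls into" is taken to mean the first class whose cumulative frequency reaches the position.

```python
from itertools import accumulate


def find_percentile_class(frequencies, i):
    """Return (position, class_index) for the i-th percentile.

    position    = (i / 100) * n
    class_index = index of the first class whose cumulative
                  frequency F_k reaches that position (assumed rule).
    """
    if not 1 <= i <= 99:
        raise ValueError("i must be between 1 and 99")
    n = sum(frequencies)
    position = i * n / 100                      # same as (i / 100) * n
    cumulative = list(accumulate(frequencies))  # running totals F_k
    for index, f_k in enumerate(cumulative):
        if f_k >= position:
            return position, index
    raise ValueError("frequencies must not be empty")


# Sample frequencies (illustrative): n = 40, cumulative = 4, 14, 30, 40
position, class_index = find_percentile_class([4, 10, 16, 10], 85)
print(position, class_index)  # 34.0 3
```

The index is zero-based, so `3` refers to the fourth class in the list.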

Calculate the Percentile Value using the Interpolation Formula

Once we know the class, we use this magic interpolation formula:

$$P_i = T_b + \left( \frac{\frac{i}{100}n - F_{kum}}{f_i} \right) p$$

Where:

  • $P_i$ = Value of the i-th Percentile (what we're looking for)
  • $T_b$ = Lower boundary of the i-th percentile class
  • $i$ = Which percentile (e.g., 10, 85)
  • $n$ = Total frequency
  • $F_{kum}$ = Cumulative frequency BEFORE the i-th percentile class
  • $f_i$ = Frequency of the i-th percentile class
  • $p$ = Class width

Notice that the formula is very similar to the quartile formula. The only difference is the $\frac{i}{100}n$ part (quartiles use $\frac{i}{4}n$).
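
To see how the pieces of the formula fit together, here is a minimal Python sketch. The function name `grouped_percentile`, its parameter names, and the small dataset in the usage line are all made up for illustration; the sketch also assumes every class has the same width and that the percentile class is the first one whose cumulative frequency reaches the position.

```python
from itertools import accumulate


def grouped_percentile(lower_boundaries, frequencies, class_width, i):
    """Estimate the i-th percentile of grouped data by linear interpolation."""
    if not 1 <= i <= 99:
        raise ValueError("i must be between 1 and 99")
    n = sum(frequencies)                        # total frequency n
    position = i * n / 100                      # (i / 100) * n
    cumulative = list(accumulate(frequencies))  # F_k for each class

    # Percentile class: first class whose cumulative frequency reaches the position.
    k = next(idx for idx, f_k in enumerate(cumulative) if f_k >= position)

    t_b = lower_boundaries[k]                   # T_b, lower boundary of that class
    f_kum = cumulative[k - 1] if k > 0 else 0   # F_kum, cumulative frequency before it
    f_i = frequencies[k]                        # f_i, frequency of that class
    return t_b + (position - f_kum) / f_i * class_width


# Illustrative data: three classes of width 10 with frequencies 5, 10, 5 (n = 20).
print(grouped_percentile([0.5, 10.5, 20.5], [5, 10, 5], 10, 50))  # 15.5
```

Swapping `i * n / 100` for `i * n / 4` would give the quartile version, which is the single difference the text points out.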

Finding Math Test Scores

For example, let's say we have the math test scores of 40 students:

| Test Score | Frequency ($f$) | Cumulative Frequency ($F_k$) | Lower Boundary ($T_b$) | Class Width ($p$) |
| --- | --- | --- | --- | --- |
| 61-70 | 4 | 4 | 60.5 | 10 |
| 71-80 | 10 | 14 | 70.5 | 10 |
| 81-90 | 16 | 30 | 80.5 | 10 |
| 91-100 | 10 | 40 | 90.5 | 10 |
| Total | 40 | | | |
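
The derived columns of the table can be rebuilt from just the class limits and frequencies. The sketch below is an illustration, not the lesson's own method: it assumes the scores are whole numbers, so that the lower boundary is the lower class limit minus 0.5 and the class width is the upper limit minus the lower limit plus 1. Both assumptions match the values in the table.

```python
from itertools import accumulate

# Class limits and frequencies copied from the table above.
classes = [(61, 70), (71, 80), (81, 90), (91, 100)]
frequencies = [4, 10, 16, 10]

cumulative = list(accumulate(frequencies))            # F_k: running total of f
lower_boundaries = [low - 0.5 for low, _ in classes]  # T_b (assumes integer scores)
widths = [high - low + 1 for low, high in classes]    # p   (assumes integer scores)

for (low, high), f, f_k, t_b, p in zip(
    classes, frequencies, cumulative, lower_boundaries, widths
):
    print(f"{low}-{high}: f={f}, F_k={f_k}, T_b={t_b}, p={p}")
```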

We want to find the value of the 85th Percentile ($P_{85}$).

  1. Find the Position of $P_{85}$:

    Position of $P_{85}$ = $\frac{85}{100} \times 40 = \frac{3400}{100} = 34$, so it is the 34th data point.

  2. Determine the Class of $P_{85}$:

    Look at the $F_k$ column. Where is the 34th data point? The 81-90 class has $F_k = 30$ (not enough). The 91-100 class has $F_k = 40$ (data points 31 through 40 are here). So, the $P_{85}$ class is 91-100.

  3. Gather Ingredients for the Formula:

    • $T_b$ (Lower boundary of class 91-100) = 90.5
    • $i$ = 85
    • $n$ = 40
    • $F_{kum}$ (Cumulative frequency before class 91-100) = 30
    • $f_{85}$ (Frequency of class 91-100) = 10
    • $p$ (Class width) = 10

  4. Calculate $P_{85}$:

    $$P_{85} = T_b + \left( \frac{\frac{85}{100}n - F_{kum}}{f_{85}} \right) p$$
    $$P_{85} = 90.5 + \left( \frac{34 - 30}{10} \right) 10$$
    $$P_{85} = 90.5 + \left( \frac{4}{10} \right) 10$$
    $$P_{85} = 90.5 + 4$$
    $$P_{85} = 94.5$$

So, the 85th Percentile value is 94.5. This means about 85% of the students scored 94.5 or less, and about 15% scored above 94.5.
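
As a sanity check, the interpolation can be run in reverse. Assuming the 10 scores in the 91-100 class are spread evenly across the class (the same assumption the interpolation formula relies on), the estimated number of students at or below 94.5 is the 30 students below the class plus a proportional share of the class itself:

$$30 + \left( \frac{94.5 - 90.5}{10} \right) 10 = 30 + 4 = 34, \qquad \frac{34}{40} = 85\%$$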

Exercise

Try calculating the value of the 20th Percentile ($P_{20}$) from the math test score data above!

Answer Key

  1. Position of $P_{20}$:

    Position of $P_{20}$ = $\frac{20}{100} \times 40 = \frac{800}{100} = 8$, so it is the 8th data point.

  2. Class of $P_{20}$:

    Look at $F_k$. The 8th data point is in the 71-80 class (because the previous class's $F_k$ is 4, and this class's $F_k$ is 14).

  3. Formula Ingredients:

    • $T_b$ = 70.5
    • $i$ = 20
    • $n$ = 40
    • $F_{kum}$ = 4
    • $f_{20}$ = 10
    • $p$ = 10

  4. Calculate $P_{20}$:

    $$P_{20} = T_b + \left( \frac{\frac{20}{100}n - F_{kum}}{f_{20}} \right) p$$
    $$P_{20} = 70.5 + \left( \frac{8 - 4}{10} \right) 10$$
    $$P_{20} = 70.5 + \left( \frac{4}{10} \right) 10$$
    $$P_{20} = 70.5 + 4$$
    $$P_{20} = 74.5$$

The 20th Percentile value is 74.5.