Preston Jensen

Powerlifting Percentiles


Introduction

In the realm of powerlifting, the measure of an athlete’s prowess is distilled down to their total – the cumulative weight of their best squat, bench press, and deadlift. However, this raw total only scratches the surface of the competitive landscape. It’s obviously far more impressive for a 57kg/125lbs woman to lift 400kg/880lbs than it is for a 93kg/205lbs man to total the same. As a data scientist and a powerlifter, I’ve developed a tool that offers a deeper insight into how to quantify this. This tool isn’t just about numbers; it’s about context, enabling athletes to gauge their performance not just in isolation but against a broader competitive field, considering crucial factors such as bodyweight and sex.

Background

Powerlifting is a sport of nuance, where athletes compete within categories defined by bodyweight and sex to ensure fairness and equity. However, comparing performance across these categories or even within them can be complex due to the diversity of human physiques and strength capabilities. Traditional metrics like DOTS, Wilks, and IPF GL points provide a solution, but they often seem abstract — what exactly does a 400 DOTS mean? They transform performances into scores that, while useful for competition, lack intuitive meaning for many athletes and coaches.

Recognizing this gap, my aim was to create a tool that transcends traditional scoring systems, offering a percentile-based analysis of a powerlifter’s total. This approach is grounded in statistical analysis, providing a clear, intuitive understanding of where an athlete stands among their peers. For instance, knowing that a 600kg/1322lbs total at a bodyweight of 90kg/198lbs places a male athlete in the top 30% of competitors offers a more relatable sense of achievement than a score or coefficient might.

This tool is designed for powerlifters who seek a deeper understanding of their competitive standing, coaches aiming to benchmark their athletes, and enthusiasts who enjoy dissecting the analytics behind the sport. By inputting sex, bodyweight, and total lift, users receive a percentile ranking that reflects their performance in relation to a comprehensive database of powerlifting outcomes. This not only helps in setting realistic goals and strategies but also fosters a greater appreciation for the nuances of strength sports.

Give it a try here: http://powerlifting-percentiles.vercel.app

Let’s dig in

Using the publicly available powerlifting dataset from https://data.openpowerlifting.org/, we can start digging into all kinds of data about powerlifting competitions. The first thing that’s useful to look at is what the distribution of totals looks like for a given sex and bodyweight. Below are the distributions for men and women from bodyweights of 50kg/110lbs to 120kg/265lbs.

You’ll notice that a normal distribution does a really good job of modeling the distribution of totals for each of the given sex/bodyweight pairs. Going into this, I was expecting there to be longer tails, especially on the right side because I am not filtering out non-tested competitions. Nevertheless, the normal distribution does a great job. It’s always cool when numbers just work like that.

We already have everything we need to calculate the percentiles for different totals: we’d just need to compute the mean and standard deviation for every bodyweight in the dataset, which is totally doable since the dataset has plenty of data on a wide range of bodyweights. We could simply look at each kilogram of bodyweight from 40kg up to 200kg, and compute the mean and standard deviation for males and females. This would give us roughly (160 unique bodyweights) * (2 sexes) * (2 parameters) = 640 numbers.
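Here’s a rough sketch of what that brute-force approach could look like. It rounds each lifter’s bodyweight to the nearest kilogram and computes the mean and standard deviation of totals per bin. The data is synthetic, and the helper name `summarize_by_bodyweight` and the `min_count` cutoff are illustrative assumptions rather than part of the actual analysis; with the real dataset you would run it once per sex.

```python
import numpy as np

def summarize_by_bodyweight(bodyweights, totals, min_count=30):
    """Return {bodyweight_kg: (mean_total, std_total)} for 1 kg bins.

    Hypothetical helper: bins with fewer than `min_count` lifters are
    skipped because their estimates would be too noisy.
    """
    bodyweights = np.asarray(bodyweights, dtype=float)
    totals = np.asarray(totals, dtype=float)
    bins = np.rint(bodyweights).astype(int)  # nearest whole kilogram
    summary = {}
    for bw in np.unique(bins):
        group = totals[bins == bw]
        if group.size >= min_count:
            summary[int(bw)] = (float(group.mean()), float(group.std(ddof=1)))
    return summary

# Synthetic stand-in for one sex (made-up numbers, not real results).
rng = np.random.default_rng(0)
fake_bodyweights = rng.uniform(60, 100, size=20_000)
fake_totals = rng.normal(loc=100 + 5 * fake_bodyweights, scale=80)

summary = summarize_by_bodyweight(fake_bodyweights, fake_totals)
mean_80, std_80 = summary[80]
print(f"80 kg bin: mean={mean_80:.1f}, std={std_80:.1f}")
```

Every bin produces its own pair of numbers, which is exactly the pile of parameters described above.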

However, this sucks and is ugly. The beauty of statistics is that you can take many numbers and model them using few numbers. So let’s keep going.

What would be nice at this point is a way to predict the mean and standard deviation based on sex and bodyweight. Let’s see if we can do that. I’ll start by plotting the mean and standard deviation for males and females based on bodyweight.

We can work with this. The plots for the mean look logarithmic. This would make sense, as relative strength tends to decrease at higher bodyweights. Let’s see if we can fit a curve of the form f(x) = a * ln(b * x + c) to these data points.
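A minimal sketch of such a fit, assuming SciPy’s `curve_fit` as the optimizer (the post doesn’t say which fitting routine was used). The "observed" means are synthetic, generated from a made-up curve plus noise, and the clipping inside the log is a guard added only for this sketch.

```python
import numpy as np
from scipy.optimize import curve_fit

def mean_curve_func(x, a, b, c):
    # Clip the log argument so the optimizer never sees NaN if it wanders
    # into a region where b * x + c <= 0 (a guard added for this sketch).
    return a * np.log(np.maximum(b * x + c, 1e-9))

# Synthetic per-bodyweight means: a made-up "true" curve plus noise.
rng = np.random.default_rng(0)
bodyweights = np.arange(50, 181, dtype=float)
true_params = (170.0, 0.5, -20.0)
observed_means = mean_curve_func(bodyweights, *true_params) + rng.normal(0, 3, bodyweights.size)

# The starting guess has to keep b * x + c positive over the whole range.
fitted_params, _ = curve_fit(
    mean_curve_func, bodyweights, observed_means, p0=(160.0, 0.5, -18.0)
)

predicted_at_90 = mean_curve_func(90.0, *fitted_params)
true_at_90 = mean_curve_func(90.0, *true_params)
print(f"fitted mean at 90 kg: {predicted_at_90:.1f} (true curve: {true_at_90:.1f})")
```

The individual parameters are strongly correlated with each other, so it is safer to judge the fit by how well the curve tracks the data than by the parameter values themselves.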

That’s looking really good! Now, let’s do something similar with the standard deviations. The standard deviations tend to increase with bodyweight, but the trend doesn’t look linear. There’s a slight bend, so they grow more rapidly at higher bodyweights. Let’s try fitting a polynomial function of degree 3 to the data.
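A cubic fit like this is a one-liner in NumPy. The sketch below assumes `numpy.polynomial.polynomial.polyfit`, which returns coefficients from lowest to highest degree. The "observed" standard deviations are synthetic, generated from a made-up cubic plus noise.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def std_curve_func(x, a, b, c, d):
    return a + b * x + c * x ** 2 + d * x ** 3

# Synthetic per-bodyweight standard deviations (made-up curve plus noise).
rng = np.random.default_rng(1)
bodyweights = np.arange(50, 181, dtype=float)
true_coeffs = (70.0, -0.1, 0.005, -1e-5)
observed_stds = std_curve_func(bodyweights, *true_coeffs) + rng.normal(0, 1.5, bodyweights.size)

# Coefficients come back lowest degree first, matching (a, b, c, d) above.
fitted_coeffs = P.polyfit(bodyweights, observed_stds, deg=3)

fitted_at_100 = std_curve_func(100.0, *fitted_coeffs)
true_at_100 = std_curve_func(100.0, *true_coeffs)
print(f"fitted std at 100 kg: {fitted_at_100:.1f} (true curve: {true_at_100:.1f})")
```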

That’s pretty good! Now, for each sex, we have just 3 numbers to calculate the mean, and 4 to calculate the standard deviation, giving us 14 total. That’s far fewer than the 640 from before and has the added benefit of smoothing out the data for us. 

Piecing it all together

Now that we have a model for the mean and standard deviation based on bodyweight, we can answer our original question: what percentile does a given total fall in for a given bodyweight and sex? Here’s how you’d do it in Python:

import numpy as np
from scipy.stats import norm

def mean_curve_func(x, A, B, C):
    # Logarithmic model for the mean total at bodyweight x (kg)
    return A * np.log(B * x + C)

def std_curve_func(x, A, B, C, D):
    # Cubic polynomial model for the standard deviation of totals at bodyweight x (kg)
    return A + B * x + C * x ** 2 + D * x ** 3

def calculate_percentile(total, bodyweight, sex):
    # total and bodyweight are in kg; sex is 'M' or 'F'
    if sex == 'M':
        a_mean, b_mean, c_mean, a_std, b_std, c_std, d_std = (172.60089846052378, 0.4968538227117669, -20.750205863276193, 72.0417687388453, -0.09976586871208123, 0.004811979334739356, -9.2217747522473e-06)
    elif sex == 'F':
        a_mean, b_mean, c_mean, a_std, b_std, c_std, d_std = (82.83876086730504, 1.1234655444197217, -36.36091255869513, -29.441592676328575, 2.822549893189308, -0.028338224120590404, 0.00010359303020098536)
    else:
        # Any other value: average the male and female percentiles
        return (calculate_percentile(total, bodyweight, 'M') + calculate_percentile(total, bodyweight, 'F')) / 2

    mean = mean_curve_func(bodyweight, a_mean, b_mean, c_mean)
    std = std_curve_func(bodyweight, a_std, b_std, c_std, d_std)

    # Percentile = share of the fitted normal distribution at or below this total
    cdf_value = norm.cdf(total, loc=mean, scale=std)
    percentile_score = cdf_value * 100
    return percentile_score

Note: The values you see above are based on male body weights between 50kg and 180kg, and female body weights between 40kg and 156kg. This is where most of the data is. Beyond those numbers, the predictions might get weird.
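As a sanity check, here’s a dependency-free sketch that redoes the male calculation for the example from the Background section (a 600kg total at 90kg bodyweight). It uses the standard library’s `statistics.NormalDist` in place of SciPy, and the function name `male_percentile` is just an illustrative choice. The coefficients are the male ones from the code above.

```python
import math
from statistics import NormalDist

# Male coefficients copied from calculate_percentile above.
A_MEAN, B_MEAN, C_MEAN = 172.60089846052378, 0.4968538227117669, -20.750205863276193
A_STD, B_STD, C_STD, D_STD = (
    72.0417687388453,
    -0.09976586871208123,
    0.004811979334739356,
    -9.2217747522473e-06,
)

def male_percentile(total, bodyweight):
    # Modeled mean and standard deviation of totals at this bodyweight
    mean = A_MEAN * math.log(B_MEAN * bodyweight + C_MEAN)
    std = A_STD + B_STD * bodyweight + C_STD * bodyweight ** 2 + D_STD * bodyweight ** 3
    # Share of the fitted normal distribution at or below this total
    return NormalDist(mu=mean, sigma=std).cdf(total) * 100

percentile = male_percentile(600, 90)
print(f"600 kg at 90 kg bodyweight: {percentile:.1f}th percentile")
```

The result should land at roughly the 70th percentile, which is the "top 30%" figure quoted earlier.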

How does this compare to DOTS, Wilks, and IPF GL points?

DOTS, Wilks, and IPF GL points are all calculated by multiplying the total by some coefficient. For DOTS and Wilks, these coefficients are inverse polynomials (k / (a + b * bodyweight + c * bodyweight^2 + …)). For IPF GL, the coefficient is calculated using an exponential (k / (a - b * np.exp(-c * bodyweight))). The exact parameters were determined statistically for each method.
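To make those two shapes concrete, here’s a small sketch for male lifters. The coefficients are the commonly published men’s DOTS and IPF GL (classic/raw) values. They are an assumption here, since this post doesn’t list them, so double-check them against the official sources before relying on them. The function names are illustrative.

```python
import math

# Commonly published men's coefficients (assumptions, not taken from this post).
DOTS_MEN = (-307.75076, 24.0900756, -0.1918759221, 0.0007391293, -0.000001093)
IPF_GL_MEN_RAW = (1199.72839, 1025.18162, 0.00921)

def dots_men(total, bodyweight):
    # Inverse polynomial: 500 / (a + b*bw + c*bw^2 + d*bw^3 + e*bw^4)
    denominator = sum(coef * bodyweight ** power for power, coef in enumerate(DOTS_MEN))
    return total * 500 / denominator

def ipf_gl_men(total, bodyweight):
    # Exponential form: 100 / (a - b * exp(-c * bw))
    a, b, c = IPF_GL_MEN_RAW
    return total * 100 / (a - b * math.exp(-c * bodyweight))

# John Haack's entry from the table in this section: 1022.5 kg at 89.9 kg.
haack_dots = dots_men(1022.5, 89.9)
haack_gl = ipf_gl_men(1022.5, 89.9)
print(f"DOTS: {haack_dots:.2f}, IPF GL: {haack_gl:.2f}")
```

Both values should come out close to the DOTS and IPF_GL columns listed for John Haack in the table in this section.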

Looking at how they all compare, we can see they all correlate with each other. The notable thing here is that IPF GL points correlate the least with the other methods. 

To get a sense of what this means for athletes, here are the top athletes on Open Powerlifting according to this percentile method, along with their DOTS, Wilks, and IPF GL scores.

Name Sex TotalKg BodyweightKg Percentile DOTS IPF_GL Wilks2
Kristy Hawkins F 687.5 73.6 99.99999889 676.3649699 137.6691834 807.3500652
Samantha Rice F 702.5 84.45 99.99999352 644.7747172 132.3909801 778.6499139
Chakera Ingram F 692.5 81.5 99.99999292 646.5530622 132.37712 777.9553236
Marianna Gasparyan F 580 57.7 99.99998561 659.2404949 134.8196479 785.5358158
John Haack M 1022.5 89.9 99.99996847 661.5198215 136.0079451 784.704424
Hunter Henderson #1 F 670 81.4 99.99996068 625.9219259 128.1412635 753.0418489
Agata Sitko F 600 68.9 99.99971785 612.1988041 124.4045128 728.5908302
Amanda Lawrence #1 F 647 83.51 99.99968344 597.0001089 122.4662853 720.0980709
Sarah Lewis #1 F 597.5 68.7 99.99968057 610.6562174 124.0875774 726.69048
Stefanie Cohen F 525 54.8 99.99967016 617.3928361 126.947125 737.7863221
Prescillia Bavoil F 585 66.46 99.99959918 609.421842 123.8323692 724.672949
Brianny Terry F 620 75.8 99.99959039 600.5065677 122.3891255 718.1649633
Carola Garra F 582.5 67.17 99.99940009 603.0764455 122.5373256 717.2647958
Blake Lehew M 915 82.3 99.99930696 620.6683121 127.2039382 737.0959947
Jawon Garrison M 915.5 82.5 99.99927919 620.1529464 127.1172106 736.4631197
Karlina Tongotea F 610.5 75.6 99.99924652 592.1243591 120.6649126 708.0097076
Ashley Contorno F 600 73 99.99915438 592.8923493 120.6423087 707.3821176
April Mathis F 705.34 113.4 99.99911051 575.6468577 122.169002 720.5754021
Kiersten Scurlock F 677.5 102.8 99.99904305 572.0828223 119.960434 708.0951469
Susan Salazar F 540.5 60 99.998912 599.1689101 122.1746983 712.9371962
Austin Perkins #1 M 851 74.24 99.99890943 614.5368092 124.7255278 730.2852624
Andrzej Stanaszek M 600 51.3 99.99883793 582.152431 107.0338839 676.3199621
Jade Jacob F 519.5 56.57 99.99876659 598.1520425 122.5546881 713.4353175
K’Aunica Byrd F 567.5 66.3 99.99866166 592.0254095 120.2996737 703.9607689
Natalie Richards #1 F 516.5 56.36 99.99859279 596.1558462 122.1920582 711.197074
Jamal Browner M 1052.5 109.4 99.99854768 624.650732 127.5091682 740.2865278
Taylor Atwood M 838.5 73.63 99.99836106 608.7664483 123.4208063 723.4173207
Yelena Espinoza F 507.5 55.4 99.99821838 592.4788791 121.6665083 707.5192574
Alana Hynes D’Aquino F 567.5 67.4 99.99816496 586.385427 119.1455463 697.4635609
Barbara Lee #1 F 567.5 67.4 99.99816496 586.385427 119.1455463 697.4635609
Stacy Burr F 565 66.9 99.99810812 586.3274069 119.1354545 697.2898861
Sherine Marcelle F 633.5 89.2 99.99801303 567.0205375 117.0183273 689.085185
Evie Corrigan F 481 51.85 99.99779858 587.4710361 121.7581506 705.0633366
Brittany Bowles F 527.5 59.75 99.99746032 586.3029417 119.5833862 697.7130429
Jessica Springer F 687.5 113.9 99.99736365 560.304642 118.9762188 701.6626694
Meghan Scanlon F 542.5 63 99.99726528 583.4668646 118.6898673 693.6525644
Tess Heaslip F 587.5 74.7 99.99716877 573.4269642 116.7890919 685.1031833
Alisha Luna F 475 51.5 99.99711682 582.8913148 120.9462342 699.9967763
Daniella Melo F 613 83.55 99.99709663 565.4983105 116.008709 682.1349969
Jenn Rotsinger F 465 50.2 99.99705247 580.9970924 121.1121833 699.4596024
Andrea Armstrong F 510 57.1 99.99686027 583.6353396 119.4715278 695.7884537
Viktoriya Ilieva F 592.5 77.5 99.99640777 567.3140859 115.7633975 679.5936319
Jessica Buettner F 585 75.25 99.99628754 568.7775928 115.8812638 679.8777718
Chad Penson M 925 90 99.99612693 598.1057663 122.9715227 709.4713579
Jesse Norris M 922.5 89.72 99.99597167 597.4278642 122.8278584 708.6963879
Chloe Lansing F 555 67.3 99.99588536 573.962045 116.6213647 682.66511
Allison Hind F 580.6 74.57 99.995734 567.2162148 115.5152526 677.607025
Tiffany Chapon F 431.5 46.77 99.99566294 567.5658408 120.1975943 688.9587868
Amanda Smith #5 F 575 73 99.99562297 568.1885014 115.6155458 677.9078627
Terri Ashley F 595 79.5 99.99557729 562.3969365 114.9435291 675.1403831

Conclusion

Wrapping up, the development of this percentile calculator for powerlifters is a straightforward step towards making sense of how totals stack up across different weight classes and sexes. It’s a practical tool that helps athletes, coaches, and enthusiasts get a quick glimpse into where a lifter’s performance sits on the broader spectrum without diving into complex formulas or coefficients.

Unlike traditional metrics like DOTS, Wilks, and IPF GL points, which serve their purpose in scoring and standardizing competition performances, this tool offers a simpler, more intuitive way to understand one’s standing. By translating totals into percentiles, it directly connects an athlete’s efforts to a position within a defined peer group, making the numbers more relatable and understandable.

This percentile calculator is a modest contribution aimed at making data a bit more accessible and meaningful for the powerlifting community. Whether you’re comparing your progress, setting new targets, or just satisfying your curiosity about where you stand, it’s here to provide those insights in a simple and straightforward manner.

PS, all the code for these calculations is available on GitHub here: https://github.com/prestoj/powerlifting-percentiles-analysis