Powerlifting Percentiles

February 14, 2024

Introduction

In the realm of powerlifting, the measure of an athlete’s prowess is distilled down to their total – the cumulative weight of their best squat, bench press, and deadlift. However, this raw total only scratches the surface of the competitive landscape. It’s obviously far more impressive for a 57kg/125lbs woman to lift 400kg/880lbs than it is for a 93kg/205lbs man to total the same. As a data scientist and a powerlifter, I’ve developed a tool that offers a deeper insight into how to quantify this. This tool isn’t just about numbers; it’s about context, enabling athletes to gauge their performance not just in isolation but against a broader competitive field, considering crucial factors such as bodyweight and sex.

Background

Powerlifting is a sport of nuance, where athletes compete within categories defined by bodyweight and sex to keep things fair. However, comparing performance across these categories, or even within them, can be complex due to the diversity of human physiques and strength capabilities. Traditional metrics like DOTS, Wilks, and IPF GL points provide a solution, but they often seem abstract: what exactly does a DOTS score of 400 mean? They transform performances into scores that, while useful for competition, lack intuitive meaning for many athletes and coaches.

Recognizing this gap, my aim was to create a tool that transcends traditional scoring systems, offering a percentile-based analysis of a powerlifter’s total. This approach is grounded in statistical analysis, providing a clear, intuitive understanding of where an athlete stands among their peers. For instance, knowing that a 600kg/1322lbs total at a bodyweight of 90kg/198lbs places a male athlete in the top 30% of competitors offers a more relatable sense of achievement than a score or coefficient might.

This tool is designed for powerlifters who seek a deeper understanding of their competitive standing, coaches aiming to benchmark their athletes, and enthusiasts who enjoy dissecting the analytics behind the sport. By inputting sex, bodyweight, and total lift, users receive a percentile ranking that reflects their performance in relation to a comprehensive database of powerlifting outcomes. This not only helps in setting realistic goals and strategies but also fosters a greater appreciation for the nuances of strength sports.

Give it a try here: http://powerlifting-percentiles.vercel.app

Let’s dig in

Using the publicly available powerlifting dataset from https://data.openpowerlifting.org/, we can start digging into all kinds of data about powerlifting competitions. The first thing that’s useful to look at is what the distribution of totals looks like for a given sex and bodyweight. Below are the distributions for men and women from bodyweights of 50kg/110lbs to 120kg/265lbs.

You’ll notice that a normal distribution does a really good job of modeling the distribution of totals for each of the given sex/bodyweight pairs. Going into this, I was expecting there to be longer tails, especially on the right side because I am not filtering out non-tested competitions. Nevertheless, the normal distribution does a great job. It’s always cool when numbers just work like that.
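If you want to sanity-check the normal fit on a single sex/bodyweight slice yourself, one rough approach is to fit a normal and compare its quantiles with the empirical ones. This is only a sketch: the helper name normal_fit_report is made up for illustration, and the totals below are synthetic stand-ins, not real OpenPowerlifting data.

```python
import numpy as np
from scipy.stats import norm

def normal_fit_report(totals, probs=(0.05, 0.25, 0.5, 0.75, 0.95)):
    """Fit a normal to `totals` and pair each empirical quantile with the fitted one."""
    totals = np.asarray(totals, dtype=float)
    mu, sigma = norm.fit(totals)  # maximum-likelihood mean and standard deviation
    rows = []
    for p in probs:
        empirical = np.quantile(totals, p)
        fitted = norm.ppf(p, loc=mu, scale=sigma)
        rows.append((p, empirical, fitted))
    return mu, sigma, rows

# Synthetic stand-in for one sex/bodyweight slice (NOT real data).
rng = np.random.default_rng(0)
fake_totals = rng.normal(loc=550.0, scale=95.0, size=5000)

mu, sigma, rows = normal_fit_report(fake_totals)
print(f"fitted mean={mu:.1f} kg, std={sigma:.1f} kg")
for p, empirical, fitted in rows:
    print(f"p={p:.2f}  empirical={empirical:7.1f}  fitted={fitted:7.1f}")
```

If the tails were much heavier than a normal allows, the 5% and 95% rows would drift apart while the middle rows stayed close.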

We already have everything we need to calculate the percentiles for different totals. We’d just need to compute the mean and standard deviation for every bodyweight in the dataset, which is totally doable since the dataset has plenty of data on a wide range of bodyweights. We could simply look at each kilogram of bodyweight from 40kg up to 200kg, and compute the mean and standard deviation for males and females. This would give us roughly (160 unique bodyweights) * (2 sexes) * (2 parameters) = 640 numbers.
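Here’s a minimal sketch of that brute-force approach for one sex, assuming you already have arrays of bodyweights and totals. Rounding to the nearest kilogram and the min_count cutoff are my assumptions for illustration, and the data is synthetic.

```python
import numpy as np

def stats_by_bodyweight(bodyweights, totals, lo=40, hi=200, min_count=30):
    """Return {kg: (mean, std, n)} for each whole-kilogram bodyweight bin."""
    bodyweights = np.asarray(bodyweights, dtype=float)
    totals = np.asarray(totals, dtype=float)
    bins = np.rint(bodyweights).astype(int)  # nearest whole kilogram
    table = {}
    for kg in range(lo, hi + 1):
        in_bin = totals[bins == kg]
        if in_bin.size >= min_count:  # skip bins too sparse to trust (hypothetical cutoff)
            table[kg] = (in_bin.mean(), in_bin.std(ddof=1), in_bin.size)
    return table

# Synthetic lifters (NOT real data): totals grow with bodyweight, plus noise.
rng = np.random.default_rng(0)
fake_bodyweights = rng.uniform(60.0, 100.0, size=4000)
fake_totals = 5.0 * fake_bodyweights + rng.normal(0.0, 50.0, size=4000)

table = stats_by_bodyweight(fake_bodyweights, fake_totals)
mean_80, std_80, n_80 = table[80]
print(f"80 kg bin: mean={mean_80:.1f}, std={std_80:.1f}, n={n_80}")
```

Run once per sex, this produces exactly the kind of lookup table of means and standard deviations described above.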

However, this sucks and is ugly. The beauty of statistics is that you can take many numbers and model them using few numbers. So let’s keep going.

What would be nice at this point is a way to predict the mean and standard deviation based on sex and bodyweight. Let’s see if we can do that. I’ll start by plotting the mean and standard deviation for males and females based on bodyweight.

We can work with this. The plot of the mean looks logarithmic. This makes sense, as relative strength tends to decrease as bodyweight increases. Let’s see if we can fit a curve of the form f(x) = a * ln(b * x + c) to these data points.
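A fit of this form can be sketched with scipy.optimize.curve_fit. To be clear about the assumptions: the choice of curve_fit, the starting guess p0, and the guard inside the log are mine, and the data points are synthetic rather than the real per-kilogram means.

```python
import numpy as np
from scipy.optimize import curve_fit

def mean_curve(x, a, b, c):
    # Guard the log argument so the optimizer never evaluates log of a non-positive number.
    return a * np.log(np.maximum(b * x + c, 1e-9))

# Synthetic per-kilogram means (illustrative only, NOT the real data).
rng = np.random.default_rng(1)
bodyweight = np.arange(50, 181, dtype=float)
true_params = (170.0, 0.5, -20.0)
noise_free = mean_curve(bodyweight, *true_params)
observed_mean = noise_free + rng.normal(0.0, 2.0, bodyweight.size)

# curve_fit needs a reasonable starting guess; this one is a hypothetical choice.
p0 = (150.0, 0.5, -15.0)
params, _ = curve_fit(mean_curve, bodyweight, observed_mean, p0=p0, maxfev=10000)

fitted = mean_curve(bodyweight, *params)
max_error = np.max(np.abs(fitted - noise_free))
print("a, b, c =", params)
print("max abs error vs. noise-free curve (kg):", max_error)
```

The individual parameters can trade off against each other a bit, so it is safer to judge the fit by how well the curve tracks the points than by the raw values of a, b, and c.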

That’s looking really good! Now, let’s do something similar with the standard deviations. The standard deviations tend to increase with bodyweight, but the trend doesn’t look linear. There’s a slight bend in it, so it grows more rapidly at higher bodyweights. Let’s try fitting a polynomial function of degree 3 to the data.
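For the cubic, NumPy’s polynomial module does the job. This is a sketch on synthetic data; I’m using numpy.polynomial.polynomial.polyfit here because it returns coefficients lowest degree first, which lines up with an A + B*x + C*x^2 + D*x^3 layout, but that choice is an assumption on my part.

```python
import numpy as np
from numpy.polynomial import polynomial as P

bodyweight = np.arange(50, 181, dtype=float)

# Synthetic per-kilogram standard deviations from a known cubic (illustrative only).
true_coefs = np.array([72.0, -0.1, 0.0048, -9.2e-6])  # A, B, C, D (lowest degree first)
observed_std = P.polyval(bodyweight, true_coefs)

# Least-squares cubic fit; coefficients come back lowest degree first.
coefs = P.polyfit(bodyweight, observed_std, deg=3)
fitted = P.polyval(bodyweight, coefs)

print("A, B, C, D =", coefs)
```

Note that the older np.polyfit returns coefficients highest degree first, so the order would need to be reversed if you use that instead.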

That’s pretty good! Now, for each sex, we have just 3 numbers to calculate the mean, and 4 to calculate the standard deviation, giving us 14 total. That’s far fewer than the 640 from before and has the added benefit of smoothing out the data for us.

Piecing it all together

Now that we have a model for the mean and standard deviation based on bodyweight, we can answer our original question: what percentile does a given total land in for a given bodyweight and sex? Here’s how you’d do it in Python:

import numpy as np
from scipy.stats import norm

def mean_curve_func(x, A, B, C):
    # Logarithmic model for the mean total at bodyweight x (kg)
    return A * np.log(B * x + C)

def std_curve_func(x, A, B, C, D):
    # Cubic polynomial model for the standard deviation of totals at bodyweight x (kg)
    return A + B * x + C * x ** 2 + D * x ** 3

def calculate_percentile(total, bodyweight, sex):
    # total and bodyweight are in kg; sex is 'M' or 'F'
    if sex == 'M':
        a_mean, b_mean, c_mean, a_std, b_std, c_std, d_std = (172.60089846052378, 0.4968538227117669, -20.750205863276193, 72.0417687388453, -0.09976586871208123, 0.004811979334739356, -9.2217747522473e-06)
    elif sex == 'F':
        a_mean, b_mean, c_mean, a_std, b_std, c_std, d_std = (82.83876086730504, 1.1234655444197217, -36.36091255869513, -29.441592676328575, 2.822549893189308, -0.028338224120590404, 0.00010359303020098536)
    else:
        # Any other value: average the male and female percentiles
        return (calculate_percentile(total, bodyweight, 'M') + calculate_percentile(total, bodyweight, 'F')) / 2

    mean = mean_curve_func(bodyweight, a_mean, b_mean, c_mean)
    std = std_curve_func(bodyweight, a_std, b_std, c_std, d_std)

    # Percentile = normal CDF of the total under the modeled mean and std, scaled to 0-100
    cdf_value = norm.cdf(total, loc=mean, scale=std)
    percentile_score = cdf_value * 100
    return percentile_score

Note: The values you see above are based on male body weights between 50kg and 180kg, and female body weights between 40kg and 156kg. This is where most of the data is. Beyond those numbers, the predictions might get weird.
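One way to make that caveat harder to miss is to warn whenever a bodyweight falls outside the fitted range. Here’s a self-contained sketch: the wrapper name and the table layout are hypothetical, while the coefficients and ranges are copied from the code and note above.

```python
import warnings
import numpy as np
from scipy.stats import norm

# Fitted bodyweight ranges from the note above (kg).
FITTED_RANGE = {"M": (50.0, 180.0), "F": (40.0, 156.0)}

# Coefficients copied from calculate_percentile: (a, b, c) for the mean, (A, B, C, D) for the std.
COEFS = {
    "M": ((172.60089846052378, 0.4968538227117669, -20.750205863276193),
          (72.0417687388453, -0.09976586871208123, 0.004811979334739356, -9.2217747522473e-06)),
    "F": ((82.83876086730504, 1.1234655444197217, -36.36091255869513),
          (-29.441592676328575, 2.822549893189308, -0.028338224120590404, 0.00010359303020098536)),
}

def percentile_with_range_check(total, bodyweight, sex):
    """Percentile for sex 'M' or 'F'; warns when bodyweight is outside the fitted range."""
    lo, hi = FITTED_RANGE[sex]
    if not lo <= bodyweight <= hi:
        warnings.warn(f"bodyweight {bodyweight} kg is outside the fitted range {lo}-{hi} kg")
    (a, b, c), (A, B, C, D) = COEFS[sex]
    mean = a * np.log(b * bodyweight + c)
    std = A + B * bodyweight + C * bodyweight ** 2 + D * bodyweight ** 3
    return float(norm.cdf(total, loc=mean, scale=std) * 100)

example = percentile_with_range_check(600, 90, "M")
print(f"600 kg total at 90 kg bodyweight (M): {example:.1f}th percentile")
```

The example call is the 600kg total at 90kg bodyweight from the Background section, and it should land close to the top-30% figure quoted there.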

How does this compare to DOTS, Wilks, and IPF GL points?

DOTS, Wilks, and IPF GL points are all calculated by multiplying the total by some coefficient. For DOTS and Wilks, these coefficients are inverse polynomials (k / (a + b * bodyweight + c * bodyweight^2 + …)). For IPF GL, the coefficient is calculated using an exponential (k / (a - b * exp(-c * bodyweight))). The exact parameters were determined statistically for each method.
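To make those two shapes concrete, here’s a rough sketch of DOTS and IPF GL. The coefficient values are not from this article: they’re the commonly published parameters as I understand them (the IPF GL ones being for classic/raw powerlifting), so treat them as assumptions and check them against the official sources before relying on them.

```python
import math

# ASSUMPTION: commonly published coefficients, not taken from this article.
# DOTS: denominator polynomial coefficients, lowest degree first.
DOTS_COEFS = {
    "M": (-307.75076, 24.0900756, -0.1918759221, 0.0007391293, -0.000001093),
    "F": (-57.96288, 13.6175032, -0.1126655495, 0.0005158568, -0.0000010706),
}
# IPF GL (classic/raw powerlifting): (a, b, c) in k / (a - b * exp(-c * bodyweight)).
IPF_GL_COEFS = {
    "M": (1199.72839, 1025.18162, 0.00921),
    "F": (610.32796, 1045.59282, 0.03048),
}

def dots(total, bodyweight, sex):
    """Inverse-polynomial form: total * k / (a + b*bw + c*bw^2 + ...), with k = 500."""
    denom = sum(coef * bodyweight ** power for power, coef in enumerate(DOTS_COEFS[sex]))
    return total * 500.0 / denom

def ipf_gl(total, bodyweight, sex):
    """Exponential form: total * k / (a - b * exp(-c * bw)), with k = 100."""
    a, b, c = IPF_GL_COEFS[sex]
    return total * 100.0 / (a - b * math.exp(-c * bodyweight))

# A 687.5 kg total at 73.6 kg bodyweight (F).
print(dots(687.5, 73.6, "F"), ipf_gl(687.5, 73.6, "F"))
```

Both functions just rescale the total by a bodyweight-dependent factor, which is why the resulting scores have no built-in meaning beyond ranking lifters against each other.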

Looking at how they all compare, we can see they all correlate with each other. The notable thing here is that IPF GL points correlate the least with the other methods.
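Mechanically, a comparison like this comes down to a correlation matrix. The sketch below uses np.corrcoef on six rows of scores copied (rounded) from the table further down, purely to show the call. It is not the full-dataset comparison described above, so don’t read anything into the exact numbers it prints.

```python
import numpy as np

# Columns: DOTS, IPF_GL, Wilks2. Six rows copied (rounded) from the table in this post.
scores = np.array([
    [676.36, 137.67, 807.35],  # Kristy Hawkins
    [644.77, 132.39, 778.65],  # Samantha Rice
    [646.55, 132.38, 777.96],  # Chakera Ingram
    [659.24, 134.82, 785.54],  # Marianna Gasparyan
    [661.52, 136.01, 784.70],  # John Haack
    [625.92, 128.14, 753.04],  # Hunter Henderson #1
])
labels = ["DOTS", "IPF_GL", "Wilks2"]

# rowvar=False treats each column as one variable, giving a 3x3 Pearson matrix.
corr = np.corrcoef(scores, rowvar=False)

for label, row in zip(labels, corr):
    print(f"{label:>7}", " ".join(f"{value:6.3f}" for value in row))
```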

To get a sense of what this means for athletes, here are the top athletes on Open Powerlifting according to this percentile method, along with their DOTS, Wilks, and IPF GL scores.


Name	Sex	TotalKg	BodyweightKg	Percentile	DOTS	IPF_GL	Wilks2
Kristy Hawkins	F	687.5	73.6	99.99999889	676.3649699	137.6691834	807.3500652
Samantha Rice	F	702.5	84.45	99.99999352	644.7747172	132.3909801	778.6499139
Chakera Ingram	F	692.5	81.5	99.99999292	646.5530622	132.37712	777.9553236
Marianna Gasparyan	F	580	57.7	99.99998561	659.2404949	134.8196479	785.5358158
John Haack	M	1022.5	89.9	99.99996847	661.5198215	136.0079451	784.704424
Hunter Henderson #1	F	670	81.4	99.99996068	625.9219259	128.1412635	753.0418489
Agata Sitko	F	600	68.9	99.99971785	612.1988041	124.4045128	728.5908302
Amanda Lawrence #1	F	647	83.51	99.99968344	597.0001089	122.4662853	720.0980709
Sarah Lewis #1	F	597.5	68.7	99.99968057	610.6562174	124.0875774	726.69048
Stefanie Cohen	F	525	54.8	99.99967016	617.3928361	126.947125	737.7863221
Prescillia Bavoil	F	585	66.46	99.99959918	609.421842	123.8323692	724.672949
Brianny Terry	F	620	75.8	99.99959039	600.5065677	122.3891255	718.1649633
Carola Garra	F	582.5	67.17	99.99940009	603.0764455	122.5373256	717.2647958
Blake Lehew	M	915	82.3	99.99930696	620.6683121	127.2039382	737.0959947
Jawon Garrison	M	915.5	82.5	99.99927919	620.1529464	127.1172106	736.4631197
Karlina Tongotea	F	610.5	75.6	99.99924652	592.1243591	120.6649126	708.0097076
Ashley Contorno	F	600	73	99.99915438	592.8923493	120.6423087	707.3821176
April Mathis	F	705.34	113.4	99.99911051	575.6468577	122.169002	720.5754021
Kiersten Scurlock	F	677.5	102.8	99.99904305	572.0828223	119.960434	708.0951469
Susan Salazar	F	540.5	60	99.998912	599.1689101	122.1746983	712.9371962
Austin Perkins #1	M	851	74.24	99.99890943	614.5368092	124.7255278	730.2852624
Andrzej Stanaszek	M	600	51.3	99.99883793	582.152431	107.0338839	676.3199621
Jade Jacob	F	519.5	56.57	99.99876659	598.1520425	122.5546881	713.4353175
K’Aunica Byrd	F	567.5	66.3	99.99866166	592.0254095	120.2996737	703.9607689
Natalie Richards #1	F	516.5	56.36	99.99859279	596.1558462	122.1920582	711.197074
Jamal Browner	M	1052.5	109.4	99.99854768	624.650732	127.5091682	740.2865278
Taylor Atwood	M	838.5	73.63	99.99836106	608.7664483	123.4208063	723.4173207
Yelena Espinoza	F	507.5	55.4	99.99821838	592.4788791	121.6665083	707.5192574
Alana Hynes D’Aquino	F	567.5	67.4	99.99816496	586.385427	119.1455463	697.4635609
Barbara Lee #1	F	567.5	67.4	99.99816496	586.385427	119.1455463	697.4635609
Stacy Burr	F	565	66.9	99.99810812	586.3274069	119.1354545	697.2898861
Sherine Marcelle	F	633.5	89.2	99.99801303	567.0205375	117.0183273	689.085185
Evie Corrigan	F	481	51.85	99.99779858	587.4710361	121.7581506	705.0633366
Brittany Bowles	F	527.5	59.75	99.99746032	586.3029417	119.5833862	697.7130429
Jessica Springer	F	687.5	113.9	99.99736365	560.304642	118.9762188	701.6626694
Meghan Scanlon	F	542.5	63	99.99726528	583.4668646	118.6898673	693.6525644
Tess Heaslip	F	587.5	74.7	99.99716877	573.4269642	116.7890919	685.1031833
Alisha Luna	F	475	51.5	99.99711682	582.8913148	120.9462342	699.9967763
Daniella Melo	F	613	83.55	99.99709663	565.4983105	116.008709	682.1349969
Jenn Rotsinger	F	465	50.2	99.99705247	580.9970924	121.1121833	699.4596024
Andrea Armstrong	F	510	57.1	99.99686027	583.6353396	119.4715278	695.7884537
Viktoriya Ilieva	F	592.5	77.5	99.99640777	567.3140859	115.7633975	679.5936319
Jessica Buettner	F	585	75.25	99.99628754	568.7775928	115.8812638	679.8777718
Chad Penson	M	925	90	99.99612693	598.1057663	122.9715227	709.4713579
Jesse Norris	M	922.5	89.72	99.99597167	597.4278642	122.8278584	708.6963879
Chloe Lansing	F	555	67.3	99.99588536	573.962045	116.6213647	682.66511
Allison Hind	F	580.6	74.57	99.995734	567.2162148	115.5152526	677.607025
Tiffany Chapon	F	431.5	46.77	99.99566294	567.5658408	120.1975943	688.9587868
Amanda Smith #5	F	575	73	99.99562297	568.1885014	115.6155458	677.9078627
Terri Ashley	F	595	79.5	99.99557729	562.3969365	114.9435291	675.1403831

Conclusion

Wrapping up, the development of this percentile calculator for powerlifters is a straightforward step towards making sense of how totals stack up across different weight classes and sexes. It’s a practical tool that helps athletes, coaches, and enthusiasts get a quick glimpse into where a lifter’s performance sits on the broader spectrum without diving into complex formulas or coefficients.

Unlike traditional metrics like DOTS, Wilks, and IPF GL points, which serve their purpose in scoring and standardizing competition performances, this tool offers a simpler, more intuitive way to understand one’s standing. By translating totals into percentiles, it directly connects an athlete’s efforts to a position within a defined peer group, making the numbers more relatable and understandable.

This percentile calculator is a modest contribution aimed at making data a bit more accessible and meaningful for the powerlifting community. Whether you’re comparing your progress, setting new targets, or just satisfying your curiosity about where you stand, it’s here to provide those insights in a simple and straightforward manner.

PS, all the code for these calculations is available on GitHub here: https://github.com/prestoj/powerlifting-percentiles-analysis

Preston Jensen