I’ve been messing around today with the data on https://ottoneu.fangraphs.com/averageValues and wanted to share a simple but potentially helpful tool I’ve developed for budgeting in advance of auctions. What follows are instructions for building these data into a simple model in Excel (or another spreadsheet program) that estimates the probability of a player being available at a given auction price.
For example, I’ve got an auction coming up next week in an existing 5x5 league, where Charlie Blackmon would be a perfect fit for my budget and my need for a star outfielder who isn’t super expensive. What is the probability that I can get him for $25 or less?
First, download the CSV for the game type that corresponds to your league (do not click on the first year draft only button). The next step will be to estimate the standard deviation for the player. Unfortunately, minimum values are not that helpful because leagues differ in their keeper rules (e.g., Acuna and Soto are both still $7 each in at least one league). Fortunately, we can estimate the standard deviation using just the median (thank you Niv!) and maximum values. Use the median instead of the mean to try to minimize the effect of outliers (although the mean and median are typically very close).
Next, in Cell M1, type “StDev” (without the quotation marks). Then in Cell M2, type “=-0.42+0.42*(J2-H2)”
That is, we are subtracting the Median from the Max, multiplying that difference by 0.42, and then subtracting 0.42 (note: the fact that they are both 0.42 is mere coincidence as is the fact that 42 is a number with special significance in baseball). After entering the formula into M2, drag or double-click the lower right corner down to populate the players below.
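For anyone who would rather check the arithmetic outside a spreadsheet, here is a minimal Python sketch of the same formula. The function name and the sample salaries are my own illustration, not values from the CSV; a $7 gap between max and median is simply what makes the formula return $2.52.

```python
def estimate_stdev(max_salary: float, median_salary: float) -> float:
    """Mirror of the spreadsheet formula =-0.42+0.42*(J2-H2).

    J is assumed to hold the Max salary and H the Median salary.
    """
    return -0.42 + 0.42 * (max_salary - median_salary)


# Hypothetical inputs (not taken from the download): max $34, median $27.
example_stdev = estimate_stdev(34.0, 27.0)
print(round(example_stdev, 2))
```

One thing to watch for: when the max is within $1 of the median, the formula returns zero or a negative number, and NORMDIST will return an error for that row because it needs a positive standard deviation.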
Next, in Cell N1 type “Prob<$25” (that is, we will be calculating the probability of getting each player for less than $25). In Cell N2, type “=normdist(25,k2,m2,true)”. That is, you’re calculating the cumulative probability at $25, using the Last 10 Adds as your mean along with the estimated standard deviation. (Note: I’m using the Last 10 Adds rather than the overall mean or median here because it most likely reflects the player’s current market value and is more robust against keeper-induced outliers.) After entering the formula into N2, drag or double-click the lower right corner down to populate the players below.
If you downloaded the 5x5 dataset (as of when I downloaded it this evening), you’ll get an estimated standard deviation for Blackmon of $2.52. Insert that into the formula along with his Last 10 Adds salary of $27, and you get a probability of 21.4% of getting him at $25 or less.
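The Blackmon number can be reproduced outside the spreadsheet with a short Python sketch. The function name is my own; it computes the normal cumulative probability, which is the same quantity NORMDIST returns when its last argument is TRUE, using only the standard library.

```python
import math


def prob_at_or_below(price: float, mean: float, stdev: float) -> float:
    """Normal CDF at `price`, i.e. NORMDIST(price, mean, stdev, TRUE)."""
    z = (price - mean) / stdev
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Blackmon in 5x5: Last 10 Adds of $27, estimated StDev of $2.52, target $25.
blackmon_prob = prob_at_or_below(25.0, 27.0, 2.52)
print(f"{blackmon_prob:.1%}")  # prints 21.4%
```

Changing the first argument lets you see how quickly the odds move: at a price equal to the Last 10 Adds value the probability is exactly 50%.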
Or you might find it more helpful to construct confidence intervals for planning purposes. In that case, take the Last 10 Adds plus or minus 2*StDev, which under the normal assumption covers roughly 95% of outcomes (for a 90% interval, use 1.645*StDev instead). So for Blackmon in 5x5, the roughly 95% interval would be $27 +/- $5.04 (i.e., [$22, $32]).
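Here is a small Python sketch of that interval, again with function names of my own choosing. It also reports how much of a normal distribution falls within k standard deviations of the mean, so you can see what a given multiplier buys you: plus or minus 2 covers about 95%, and plus or minus 1.645 covers about 90%.

```python
import math


def price_band(mean: float, stdev: float, k: float = 2.0) -> tuple[float, float]:
    """Return (low, high) = mean -/+ k * stdev."""
    half_width = k * stdev
    return mean - half_width, mean + half_width


def normal_coverage(k: float) -> float:
    """Share of a normal distribution within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))


# Blackmon in 5x5: Last 10 Adds of $27, estimated StDev of $2.52.
low, high = price_band(27.0, 2.52)
print(f"${low:.2f} to ${high:.2f}, coverage {normal_coverage(2.0):.1%}")
```

This is only as good as the normality assumption and the standard deviation estimate, so treat the band as a planning aid rather than a guarantee.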
In preparing for drafts, I create a “model team” where I budget for a <$5 infielder, <$25 OF, <$15 SP, etc., and then use that to construct my “shopping lists” based on the players available around those price points (hoping to get guys below those prices in order to bank surplus). I’m going to use this as a robustness check on my model team, so that I’m a little more realistic about how much I should expect players to actually cost.
It’s not perfect. Some limitations include (but are not limited to):
- the standard deviation estimate is not as precise as I would want it to be;
- normality is a strong assumption in such a finite sample;
- it doesn’t really work for players who weren’t yet established last year (for example, Fernando Tatis, Jr.); and
- the existence of keepers still distorts the market somewhat.
But hopefully what I have laid out provides a little more insight than just looking at the Last 10 Adds and applying the same plus-or-minus fudge factor to every player, because the estimated standard deviations do vary from player to player.
For those in non-Ottoneu draft leagues (i.e., non-auction leagues), I laid out a similar process in a Reddit post earlier today, using NFBC data to construct probabilities of a player being available at a given draft position.