Choosing a subset of flops to represent the whole game
There are 22100 possible flops in Holdem out of which 1755 are strategically different. This is quite a big number which makes attempts to approximate preflop EV for chosen spots as well as upcoming preflop solving quite a difficult task. One needs a few terabytes of RAM to fit even very simple games (assuming full postflop play) and using a disc storage slows things down very significantly.
It's no surprise then that players and programmers are attracted to the idea of simplifying the game a bit. One of the natural ideas is to reduce the number of flops from 1755 to something more manageable hoping that the preflop results stays approximately the same. First publicly known attempt to do that was described by Will Tipton, here: 2p2 discussion about subset of flops.
Tipton's method was based on first creating conditions a good subset must satisfy and then finding the minimum size subset which satisfies all them. Example conditions are a frequency of every card appearing, a frequency of any given pair being a top pair etc.
While this method makes sense and was improved upon by others since the original publications we have chosen a bit different road. Our method is based on defining some metrics which a good subset must satisfy and then running a solver of sorts to find the best subsets of N elements which scores the best on the metric. The metrics used are equity (against full range, against 50% of the range, against AA etc.) as well as EVs from all 1755 sets which we got access to thanks to several of our users who run high volume analysis before (see credits at the bottom of this post).
The algorithm starts from a random set and "evolves" at every iteration. We have used a random walk approach - at every step the subset is mutated in some ways and if the improvement is found that new subset becomes a new current one - rinse and repeat. The real EV results we got were divided into a training set and a testing set to avoid a situation where the same data is used for both training and testing.
We tried many metrics trying to determine the best one. Interestingly it seems a mix of EV and EQ performs better than other even if we grade the set using EV only.
The results we got are quite promising. To measure how good a subset of flops is we have used least square measure, that is a sum of squares of EV differences for every possible hand. That method punishes big deviations which is what we want. We got big improvements over Tipton's method (which contains 103 flops). Tipton's subset is performing on par with our 25 element subsets and signficantly worse than 50+ element subsets.
Without further ado let's go to the benchmarks. You can find comparison of our subsets to the real results (ones calculated on all 1755 flops) below. We are presenting 5 subsets we've developed: 25 element one, 49 element one, 75 element one, 95 element one as well as 184 element one. Additionally original Tipton's subset is added to the comparison.
- Equity against full range
- IP players in 3-bet pot, BTNvsBB; 100bb
- IP player in single raised pot, BTNvsBB 6max, 100bb
- OOP player in 3-bet pot, BTNvsBB 100bb
- OOP player in single raised pot, BTNvsBB 6max, 100bb
- IP player in single raised pot, SBvsBB 6max, 100bb
- OOP player in single raised pot, SBvsBB 6max, 100bb
It seems that 184 element subset performs really well but the smaller one should offer very good accuracy when the goal is get preflop EV, adjust, repeat. We've uploaded more benchmarks, HERE.
We hope making those subset available will make estimating preflop EVs faster and more productive process. We hope those subsets can also be used to obtain preflop solutions once the preflop solver is available. Preliminary tests look very promising.
Have fun!
Thanks
The testing data was provided by our very helpful users, among others:
- Selcouth
- Ilya "SM0LK0" Smolko
Update
Since posting this blog entry we have improved on the process and let it run on our server for much longer. The flop subsets included in preflop_subsets folder in PioSolver installation perform much better than those from the article and include larger variety of sizes from 3 to 362 flops.