Friday, October 27, 2006

sklansky's theorem: fundamentally flawed (i)

David Sklansky’s book, The Theory of Poker, is highly revered but, though I own it, I have read but a little. However, I am familiar with his Fundamental Theorem of Poker (FToP), which states:

‘Every time you play a hand differently from the way you would have played it if you could see all of your opponents’ cards, they gain; and every time you play your hand the same way you would have played it if you could see all their cards, you gain. Conversely, every time opponents play their hands differently from the way they would have if they could see all your cards, you gain; and every time they play their hands the same way they would have played if they could see all your cards, you lose.’

The first criticism isn’t technical but one of communication: it lies in the implication of ‘gain’. The usage is ambiguous, and the intended interpretation relies on knowledge that isn’t a prerequisite for playing poker or for reading much of its literature. In reality you wouldn’t gain every time you executed a perfectly informed decision - that’s poker. So we must assume Sklansky intends ‘gain’ in the EV [1], or average, sense. However, even accepting this interpretation of ‘gain’, the theorem merits a challenge. It is easy to construct a hand where the players would employ the same strategies whether or not they were all covertly aware of each other’s cards. Going all zero-sum, are they all gaining, as the theorem would suggest? Well, not in an EV sense (ordinarily).
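
For concreteness, here is a minimal sketch of ‘gain’ in the EV sense, applied to a simple call-or-fold spot; the chip counts and probabilities are purely illustrative.

def call_ev(pot, bet, win_prob):
    """EV, in chips, of calling a bet of `bet` into a pot of `pot`
    (where `pot` already includes the opponent's bet), holding the
    best hand at showdown with probability `win_prob`."""
    return win_prob * pot - (1 - win_prob) * bet

# Illustrative numbers: a 100-chip pot, 25 chips to call.
print(call_ev(pot=100, bet=25, win_prob=0.30))  # +12.5: a 'gain' on average
print(call_ev(pot=100, bet=25, win_prob=0.15))  # -6.25: a loss on average

On any single occasion the caller simply wins the pot or loses the bet; the ‘gain’ exists only as an average over many such spots, which is the reading the theorem must lean on.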

Perhaps the theorem could fence off criticism under such contrived circumstances by claiming that careful interpretation suggests they are both losing and gaining, thus facilitating a neutral outcome; well, perhaps. The theorem might seem more robust - certainly clearer - if, for example, Sklansky replaced ‘lose’ with ‘can’t gain’ - in the EV sense, of course. Clearer, more robust, perhaps: less memorable, definitely.

Moving on, the theorem appears to advance, or at least encourage, the notion that resolving the optimal play requires only knowledge of your opponent’s cards; after all, you know his holding, so what more should you demand? Simple: his strategy. We’ve all borne the frustration of discovering our foe’s failure to act as we’d anticipated (or, as pros are prone to lament, act as they should) despite correctly establishing their hand. Knowing a player’s holding could lead us to check expecting a bet, or bet expecting a fold or a raise, only to be disappointed. Thus, exploiting this information would inevitably, on occasion, direct one to an incorrect play where ignorance would not. If we reason that our opponent’s decision is ultimately determinable, then our judgement is culpable at some level in such instances, and therefore the theorem won’t hold.

Alternatively, a non-deterministic view of our opponent’s strategy could on occasion see misjudgement become the scapegoat for misfortune. It might be considered unlucky to check a monster only to find our foe choosing this opportunity not to value-bet top pair. Despite not explicitly gaining, the theorem, under a typical interpretation, might stand up, because going for the check-raise could still be the best play against his mixed strategy; in the EV sense, we’d gain every time. However, it doesn’t automatically hold should we misjudge his strategy, mixed or not. If we do so to such a degree as to render our current decision sub-optimal, then we are losing out, contradicting the theorem. So whether our opponent’s decision is theoretically determinable or not, judgemental failure can always be the theorem’s undoing.
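
A toy model illustrates the cost of misjudging a mixed strategy. Suppose hero holds the nuts on the river and must choose between betting out and checking with the intention of raising; every size and frequency below is hypothetical, chosen only to show how a wrong estimate of the villain’s betting frequency flips the ranking of the two lines.

def ev_bet(pot, bet, call_freq):
    """Hero bets `bet` with the nuts; villain calls with probability `call_freq`."""
    return pot + call_freq * bet

def ev_check_raise(pot, bet, raise_to, bet_freq, call_raise_freq):
    """Hero checks; villain bets `bet` with probability `bet_freq`.
    Hero then raises to `raise_to` in total; villain calls the raise
    with probability `call_raise_freq`, otherwise folds his bet."""
    return pot + bet_freq * ((1 - call_raise_freq) * bet + call_raise_freq * raise_to)

pot, bet, raise_to = 100, 60, 180

# Hero believes villain bets top pair 70% of the time when checked to...
print(ev_check_raise(pot, bet, raise_to, bet_freq=0.7, call_raise_freq=0.3))  # 167.2
print(ev_bet(pot, bet, call_freq=0.5))                                        # 130.0
# ...but if the true frequency is only 20%, checking was the inferior line.
print(ev_check_raise(pot, bet, raise_to, bet_freq=0.2, call_raise_freq=0.3))  # 119.2

Against the believed frequencies the check is clearly best; against the true ones it gives up over ten chips of expectation - a loss born of misjudged strategy, not of misread cards.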

Advocates of the FToP could respond by drilling down further into the EV argument, claiming players would be better informed of the right strategy, and thus more likely to make a better decision, and so gain. So where are we heading: if you see your opponent’s cards you’ll probably make a better decision than if you hadn’t? That sounds quite fundamental, and pointless.

One must assume the failure to condition the theorem on strategy is an oversight, since on a hand-by-hand basis it can easily be shown to fail in many instances.

Another apparent implicit assumption is that perfect decision-making follows from complete or sufficient information: it doesn’t. It takes knowledge, skill and discipline to turn information into effectiveness: it’s not a given. Proficiency in probability is requisite to determine the right decision; in addition, execution of the correct strategy is a non-trivial task in itself, requiring material discipline - players will often weigh up what they want to do as well as what they should do. Despite casinos allowing us to see ‘all their cards’, many a shrewd winning gambler can be found propping up a house game. In the same way that such savvy punters knowingly brave the odds and execute losing bets at roulette, I suspect all poker players will, at some time, take on draws they know aren’t justified. In fact, given this trait, it is easy to illustrate how knowing an opponent’s cards could wittingly guide one to a poorer decision.
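
The roulette point is easy to make precise: a straight-up number pays 35-1, yet a European wheel has 37 slots and an American one 38, so the bet is knowingly -EV.

def straight_up_ev(slots, payout=35):
    """Expected value, per unit staked, of a straight-up roulette bet:
    win `payout` units with probability 1/slots, lose the stake otherwise."""
    return (1 / slots) * payout - (1 - 1 / slots)

print(straight_up_ev(37))  # European wheel: about -0.027 (-2.7% of the stake)
print(straight_up_ev(38))  # American wheel: about -0.053 (-5.3% of the stake)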

It’s the turn: hero faces a bet and the certainty that a gutshot is his only hope. The 3-flush board of this strongly played hand leads him to deduce that one of his outs risks handing his opponent a flush, and to accept he may already be drawing dead; so the awful 9-1 offer to complete the hand is wisely declined. However, had he gleaned his opponent’s cards and observed his outs to be clean, he’d call, wilfully accepting the marginally poor odds [2]. The -EV decision could easily be vindicated from a number of meta-game [3] positions, but, of course, it won’t always be. And when it isn’t, the FToP fails, since the foe gains when hero pursues a course of action he’d only undertake in light of his adversary’s holding, and loses out when he doesn’t - when hero folds. So even though the theorem is not openly predicated on rational decision-making, even if it were, such decisions shouldn’t be presumed irrational, or indeed losing, particularly when taking a systemic view of a poker player [4].
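
A rough sketch of the arithmetic behind the hand, using representative numbers: on the turn a gutshot has four outs among 46 unseen cards, and the tainted out on the 3-flush board effectively reduces that to three.

def odds_against(outs, unseen=46):
    """Odds against improving on the river, as X-to-1, with `unseen`
    cards unaccounted for on the turn."""
    return (unseen - outs) / outs

pot_odds = 9.0  # the 'awful 9-1 offer' from the hand

print(odds_against(4))  # 10.5-to-1 against, even with four clean outs
print(odds_against(3))  # ~14.3-to-1 against if one out hands villain his flush

# Either way the 9-1 price falls short of what a bare gutshot needs, so the
# call is -EV on the immediate odds (implied odds ignored, as per [2]):
print(pot_odds < odds_against(4))  # True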

One could surmise the FToP reasonably assumes some requisite level of poker competency. However, these competencies are non-trivial; in fact, strict satisfaction would be very untypical. Only a minority of players are cognisant of, say, the precise pot odds required to call when holding 3-6o against an all-in K-10. Moreover, even with complete information, still fewer will know, or can calculate dynamically, the exact odds of every situation arising; I’d fail on the first and, naturally, the second. Furthermore, there is still the process of weighing the odds you’re currently getting against the odds required; not hard, but mistakes happen (especially counting the chips live). Finally, there is the requirement to execute the decision your poker-brain knows to be right. Who hasn’t failed this task? Most of us do, routinely. So it appears nigh on the entire poker population falls short of the competency levels required for this fundamental theorem to hold.
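
To underline how non-trivial ‘precise pot odds’ are, here is a rough Monte Carlo sketch for the 3-6o versus K-10 all-in. The hand evaluator is a bare-bones illustration rather than a production tool, and the suit assignments are arbitrary examples.

import random
from collections import Counter
from itertools import combinations

RANKS = '23456789TJQKA'

def rank5(cards):
    """Comparable rank of a 5-card hand; bigger tuples win."""
    vals = sorted((RANKS.index(r) + 2 for r, _ in cards), reverse=True)
    suits = [s for _, s in cards]
    counts = Counter(vals)
    groups = sorted(counts.items(), key=lambda rc: (rc[1], rc[0]), reverse=True)
    ordered = [r for r, _ in groups]
    flush = len(set(suits)) == 1
    straight = len(counts) == 5 and max(vals) - min(vals) == 4
    if set(vals) == {14, 5, 4, 3, 2}:            # the wheel
        straight, vals = True, [5, 4, 3, 2, 1]
    if straight and flush: return (8, max(vals))
    if groups[0][1] == 4:  return (7, ordered)
    if groups[0][1] == 3 and groups[1][1] == 2: return (6, ordered)
    if flush:    return (5, vals)
    if straight: return (4, max(vals))
    if groups[0][1] == 3: return (3, ordered)
    if groups[0][1] == 2 and groups[1][1] == 2: return (2, ordered)
    if groups[0][1] == 2: return (1, ordered)
    return (0, vals)

def best7(cards):
    return max(rank5(c) for c in combinations(cards, 5))

def equity(hero, villain, trials=5000):
    """Monte Carlo estimate of hero's all-in equity (ties count as half)."""
    deck = [(r, s) for r in RANKS for s in 'shdc']
    for c in hero + villain:
        deck.remove(c)
    score = 0.0
    for _ in range(trials):
        board = random.sample(deck, 5)
        h, v = best7(hero + board), best7(villain + board)
        score += 1.0 if h > v else 0.5 if h == v else 0.0
    return score / trials

# Hypothetical suits for the 3-6o versus K-10 match-up:
hero, villain = [('6', 'h'), ('3', 'd')], [('K', 's'), ('T', 'c')]
p = equity(hero, villain)
print(f"equity ~ {p:.2f}, break-even pot odds ~ {(1 - p) / p:.1f}-to-1")

Even this crude estimate requires a simulation; producing it at the table, to the precision the theorem implicitly demands, is beyond almost everyone.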

Sklansky ambitiously, and mistakenly in my opinion, attempts to craft a theorem, and a fundamental one no less, out of how people - the reader - make decisions and would respond to information. It would remain contentious and flawed, but more forgiving, were it to prescribe how people should, not would, behave and so benefit.

A fundamental theorem on individual human behaviour: Nobel Prize material indeed.

Next Article: part (ii), Friday 24th (edit) November

[1] EV – Expected Value, an unfortunate term as it often fails to represent value and is seldom expected.

[2] Ignoring implied odds.

[3] meta game: the impact of the current decision on future hands.

[4] The theorem would fail in some cases even when negative-EV decisions are justified w.r.t. meta-issues, since, invariably, at least some of hero’s meta-gain won’t be paid for by this particular opponent.

Wednesday, October 18, 2006

move on, come rain or shine

A change of tack delayed this month’s posting - apologies. Perusing the high stakes forum at 2+2, I ran across this interesting post. Though he admits to not following his own advice, the author concludes:

Anyway - when you running very bad or very hot - it's harder to get the maximum EV from this situation compared to when you running normal, cause stats of some players changed instantly (and PT can't catch this) and it takes time to find the players who really start taking shots/start to made too many folds against you.

So it's easier to change a table or take a break quitting the complicated analysis of players...

The poster advocates quitting when significantly winning or losing because the context generated by either position forces one to deviate from a more typical playing strategy: more extrapolation, less interpolation. As such, our judgement is warped and decision-making is tougher. Undoubtedly every player has seen routine decisions transform into real quandaries under the climate of running at extremes.

However, even though our decision-making becomes less efficient, it can increase in effectiveness – even with a greater error-count [1]; despite acting less optimally, we might still expect to gain. Conditions change for everyone: success is to adapt quickly and advantageously, not to match or improve on performance metrics achieved under normal conditions. It’s a different game.

The presence of asymmetric information, the culprit of adverse selection, appears to hatch the belief here that winning or losing reference-states are penalised. It is commonplace in, and to some degree particular to, online poker to be largely unaware of an adversary’s observations, and thus ignorant (at least initially) of whether he has factored your reference-state into his strategy; he, however, will be perfectly aware of any adjustments, or the lack of them. Against an opponent oblivious to your standing you would deploy a normal strategy; against one aware of it, an adapted one. Since we know not whether our rival is fish or fowl, we must compromise or risk being compromised: not an issue, or at least less of one, under routine conditions. So, comparatively, we are likely to under-perform and commit more errors; our opponent appears to have us at a disadvantage. The solution, easily affordable on the internet, is to quit, reinitialise the variables and start elsewhere under the (likely) realisation that we are observed as neither winning nor losing.

However, there is an oversight: the advantage of asymmetric information is not held solely by your opponent. Your foe, for example, isn’t aware whether you expect his game to be normal, adapted or some compromise: you are. We can reflect dilemmas to and fro indefinitely: has he presumed your response to be normal or adapted? You don’t know; he does. And so on. It seems reasonable to presume each dilemma (or item of information) carries a weight, and so in any given exchange summing these weighted advantages should determine who net gains from the context driving these imbalances.

On a more practical note, suppose that against his veiled strategy you deliver your standard game; a player unaware of your reference-state is likely to adopt his typical playing style (for you) and so maintain the status quo. Were he, though, conscious of your winning or losing state and to adopt what is, in his eyes, an appropriate adaptive strategy, then the chances are your strategy will be less efficient (or sub-optimal for you). However, he too is inappropriately applying a strategy - an adaptive game to a normal one. On this superficial evidence it is not apparent who has the upper hand. Who gains: who knows? At a guess, whichever strategy is more robust, and, naturally, whoever adapts to their opponent’s true strategy the quickest and most effectively [2].

Of course individual reasons exist to quit when winning or losing; however, increases in sub-optimal play typical of a change of climate, or indeed the generation of additional asymmetric information, aren’t necessarily among them.

Next article: Friday Oct 27th: Sklansky's theorem: fundamentally flawed (i).

[1] Error count is a poor metric since errors vary with significance; however, the arguments hold even if we view ‘error-count’ as ‘weighted-error count’ or some measure of optimality.

[2] You'd certainly expect the guy playing fewer tables to be advantaged.