poker dynamics: mindset (i)

Consider the following two attitudes:

to win;

to play well;

Sports coaches and their ilk would reckon you could steer a truck between the pair. The tradition of fair play allied with putting up ‘a spirited fight’, supposedly endemic to the British psyche, is routinely cited as the bane of the nation’s sport.

‘Winning ugly’ was doubtless schooled into hard-knocks poker grads long before the intense Brad Gilbert coined the phrase. An object lesson in the above mindset-distinction should seldom be required in poker: few competitive environments exist where the oxymoron of ‘glorious failure’ is more self-evident. Nevertheless, some players – most of us to an extent – would rather lose, or certainly profit less, than adopt a counter-machismo style.

In a sport such as tennis a winning strategy, in theory, is self-similar at all levels. In other words, in order to maximise the likelihood of winning a set, one must aspire to win every game; likewise, to win games one must aspire to win each point. Naturally, meta-issues surface: injury or fatigue will occasionally necessitate the submissive relinquishing of a forlorn set, or game; shots deemed ‘low percentage’ will serve as loss leaders from time to time. Those issues aside, the game is, in structure, strategically self-similar. However, many sports witness combatants purposefully adopt localised losing-strategies in order to protect leads or chase victories (and so not follow self-similarity) [1].

In poker, the seductive strategies tend not to be self-similar: opting to maximise potential profit in a hand will not maximise the year’s potential profit [2]; minimising short-term risk will not minimise long-term risk [3]; lines optimising the chances of winning individual pots will rarely return the best chance of winning over a clutch of hands, let alone a session or a year. Material meta-issues aside, though, a suitably bankrolled guy maximising EV (expected value) from each pot will maximise EV over the session and over the year [4]. So what’s the problem with EV?
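As a rough illustration of the distinction above, here is a minimal Python sketch. The three lines and all their probabilities and payoffs are invented for the purpose (they are not drawn from any real hand), and the function names are hypothetical. It picks a line by each of the three criteria and then scales the per-hand EV up to a session, relying only on the fact that expectations add.

```python
# Hypothetical lines for one recurring spot; every number here is invented.
# Each line maps to a list of (probability, profit in big blinds) outcomes.
LINES = {
    "big raise": [(0.10, 60.0), (0.90, -10.0)],  # largest possible payoff
    "small bet": [(0.80, 4.0), (0.20, -20.0)],   # wins the pot most often
    "call":      [(0.55, 12.0), (0.45, -8.0)],   # neither extreme
}


def ev(outcomes):
    """Expected profit of a line: sum of probability times payoff."""
    return sum(p * x for p, x in outcomes)


def best_case(outcomes):
    """Greatest attainable reward, ignoring how likely it is."""
    return max(x for _, x in outcomes)


def win_probability(outcomes):
    """Chance of showing any profit on the hand."""
    return sum(p for p, x in outcomes if x > 0)


def pick(criterion):
    """Name of the line that scores highest under the given criterion."""
    return max(LINES, key=lambda name: criterion(LINES[name]))


def session_ev(name, hands):
    """Expectation is additive, so per-hand EV scales linearly with hands."""
    return hands * ev(LINES[name])


if __name__ == "__main__":
    for label, criterion in [("potential", best_case),
                             ("win chance", win_probability),
                             ("EV", ev)]:
        chosen = pick(criterion)
        print(label, "->", chosen, round(session_ev(chosen, 1000), 1))
```

With these made-up numbers the three criteria each choose a different line, and only the EV choice carries a positive expectation when repeated over the session. That is the sense in which EV is self-similar while the seductive criteria are not.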

Now, with the gymnast there’s no reward for deviation, no punishment for accuracy. There is determinism: better execution will be reflected in better marks; improve one element, improve the whole (ceteris paribus). However, in such sports as golf, darts and snooker the penalisation of superior accuracy and/or technique abounds, though not sufficiently to confuse or mask their overall benefit [5]. Nevertheless, as a consequence, the random walks of these sporting events see wrong steps move in the right direction and vice versa.

As we move to more game-theoretic sports, driven by high-impact, low-probability events, confusion and uncertainty often reign free [6]. As such, stakeholders (interested parties) seek frameworks within which to operate, rules to follow, conventions to reinforce, superstitions to suffer and so on. All in a bid, it seems, to bring order and certainty, and to mitigate regret, in particular counterfactual regret. The ease with which one can visualise or imagine not losing in the manner lost will determine, to a large extent, the level of regret experienced in losing said way. As such, stakeholders and decision-makers will anticipate contrasting levels of counterfactual regret among candidate options, and so attach to them varying levels of anticipatory regret (ex ante); this regret will doubtless filter (or attempt to) into the decision-making process.

Put simply, if the move that would have won the game is one seldom made, there should be little cause for regret; conversely, if losing follows from failing to do the very thing that is done routinely, well, you’ll be immersed in it. Naturally, this is (typically) known in advance of the decision being taken, or the event occurring, so the emotions are anticipated and, inevitably, affect the decision. It arguably takes a strong mind, therefore, to keep a great number of decisions in play (Jose Mourinho, tactical substitutions after 20 minutes).

In this class of sport, any two (interesting) situations are seldom the same, and similar ones will often be too infrequent to draw meaningful conclusions. As such, learning is difficult; a move towards a better strategy might not realise immediate or obvious benefit (expected or otherwise). If the strategy breaks convention and/or, upon failure, increases the level of counterfactual regret suffered by stakeholders, the expected gain could well be overshadowed by the personal risk now incumbent on the decision-maker.

Revered ex-England cricket captain Mike Brearley observed a marked difference in the recrimination heaped on a captain’s retrospectively poor toss-decision, depending on the type of failed decision [7]. When the batting team fails in the first innings they are seen as disadvantaged but very much in the game: they could still win. However, when the batting side succeeds (the bowling team concedes many runs but takes only a few wickets by stumps on day 1) the bowling side are deemed to be batted out of the game: they can hope for but a draw. So, it appears, to suffer a disadvantage having elected to bowl is more lamentable than a comparable fate after opting to bat.

This intuitive feel-good logic is attractive and rational, but hardly rigorously analysed. It is somewhat akin to such poker rationale as: ‘if you’re not in the game on day 2 of the WSOP, you can’t win: make sure you’re there - you got to be in it, to win it’. Nice sentiment, but optimal? No, not likely. Batting first effectively puts a cap on how bad things are at the close of play, but does not testify as to how good things might be or, importantly, how things are likely, or expected, to be. So the cricket captain battles both convention and anticipatory regret [8].

A footballer’s (soccer) missed scoring opportunity is typically less regrettable when ‘working the keeper’ than when missing the target outright; so a strategy, or adjustment, boasting a marginal improvement in scoring but a material hike in misses could, under the weight of criticism and/or self-recrimination, become distorted and feel less, not more, effective. Indeed, for some, it would be self-fulfilling, as confidence wanes under the blame-burden. Still, even with a clear mind, issues of personal risk and aversion to criticism and blame might make maintaining the status quo the judicious choice.

Coaches often alter, or are inclined to alter, tactics or personnel if the game-state is undesirable. Once again there is a regret issue: to not change is to not try (“do something!”); few, upon failure, will grumble ‘you’ll never know what would have happened had you not made those changes’ [9]. Of course the game-dynamic is typically different and so change is legitimate. However, if, post-change, the team dominates possession and creates a multitude of chances, but fails in the end-goal, one might legitimately state winning to have been a ‘close-call counterfactual reality’: they nearly did it.

However, it is a leap to conclude that the team benefitted (in an expected sense or otherwise) simply because the performance improved post-intervention. Perhaps the decision was fortuitous rather than astute; or indeed, the game-state suggested any change would trigger improvement [10]; in addition, games can and do progress organically, so the non-interventionist counterfactual reality may outperform the interventionist one. Nevertheless, informed decisions should correlate with favourable “factuals” (outcomes), as they should with positive close-call counterfactuals (nearly good outcomes), and, naturally, poor decisions should map to negative close-call counterfactuals (and factuals). So the decision-maker, in theory, is better equipped and better advised when cognisant of both the factual and the close-call counterfactual.

In such environments marginal changes do not have marginal effects where it matters, which, as suggested, hinders learning. However, the football coach at least has access to discriminating metrics, beyond the score line, with which to measure the impact of change.

Instead, consider the poker player pondering the big-bet call-down. There are no close-call counterfactuals here; it’s binary – miss or monster. No bluff-percentages accompany the shown hand, nothing to help realign a wayward strategy, save a potentially misleading showdown. Decisions typically feel a good deal less marginal ex post than they do ex ante (assuming hand disclosure), which rather worryingly suggests a somewhat selective or perhaps suppressed world-view, either before or after the fact.

So learning through EV is tough: decisions are rarely awarded the amounts ‘expected’ (the feedback, profit or loss, therefore invariably distorts); meta-issues abound, with the value or cost radiating over a multitude of hands; and scenarios exist without the edifying close-call counterfactuals.
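To put a rough number on how badly the feedback can distort, here is a small sketch under stated assumptions: a call-down that risks 1 unit to win 2, against an opponent who is bluffing 38% of the time. Both the payoffs and the bluff frequency are hypothetical, chosen only so that the call is modestly profitable. The code computes, exactly from the binomial distribution, the chance that the caller is not ahead after n such calls.

```python
from math import comb

# Hypothetical figures, chosen only to make the call modestly +EV.
BLUFF_FREQUENCY = 0.38  # assumed chance the big bet is a bluff
WIN = 2                 # units won when the call catches a bluff
LOSS = 1                # units lost when the bettor has the monster


def call_ev(p=BLUFF_FREQUENCY):
    """Expected profit of a single call-down."""
    return p * WIN - (1 - p) * LOSS


def prob_not_ahead(n, p=BLUFF_FREQUENCY):
    """Exact chance that n independent calls net zero or less in total."""
    total = 0.0
    for k in range(n + 1):  # k = number of bluffs caught
        if k * WIN - (n - k) * LOSS <= 0:
            total += comb(n, k) * p**k * (1 - p) ** (n - k)
    return total


if __name__ == "__main__":
    print("EV per call:", round(call_ev(), 2))
    for n in (20, 200):
        print(n, "calls, chance of not being ahead:",
              round(prob_not_ahead(n), 3))
```

Under these assumptions the call is correct every time, yet a meaningful fraction of players making it twenty times would have nothing to show for it. Given how rarely a comparable big-bet spot comes round, twenty is already a generous sample.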

The idea is, of course, to learn over the long haul. Unfortunately, brain-accountancy is typically shoddy, the database unreliable and lacking in the necessary detail. What’s more, lessons are scattered, not learnt in chunks. Akin, perhaps, to hopping from one language to the next after each new word or facet of grammar is understood.

So, we budding players have a torrid time; mind you, not that I’d swap the monitor and mouse for the high bars and rings.

Next article: part ii

[1] Say, the very defensive golf-shot of the tournament leader at the 18th; the advancing goalkeeper of a trailing football (soccer) side.

[2] Note this is to survey the possible lines and select the option corresponding to the greatest attainable reward (eg raising, rather than making a sensible call, in the hope of being called, by chance, by a weaker hand). While the potential for a mother-of-all-runs could technically be viewed as self-similar, it is a somewhat facile point.

[4] Over-cautious strategies mitigating the risk of busting out of a game may, through loss of EV, increase the chances of busting the bankroll.

[5] Although of course, there is often considerable uncertainty over which technique generates the best results: a golfer reinventing a swing. So there is technique and execution; which applies to poker too, at a high level.

[6] So, for example, in football (soccer) a goal is a very infrequent event during a match compared to, say, a pass or a free-kick, but of course has very high impact, since goals absolutely determine success or failure.

[7] Cricket, specifically, is a game played over up to 5 days, where a team must bowl out 10 batsmen from the opposition twice in order to win; if both teams achieve this then the team scoring the most runs wins; if neither is able to do so, the game is a draw. Therefore, it’s hard to envisage a side losing if they’ve scored a lot of runs for the loss of a few wickets (batsmen) during their first innings on day 1.

[8] They are clearly not independent, and doubtless reinforce each other.

[9] Except, when of course changes were instigated from a favourable, winning, position.

[10] A so-called secondary counterfactual – at least in a qualitative sense. Although the specifics of the improvement effected could only occur with that explicit change, it might be argued that a number of alternative changes would have advanced matters too, each in its own way. So improvement upon change was virtually inevitable.

Friday, February 08, 2008