Monday, February 11, 2008

mindset (ii)

London buses indeed.

Consider once again the performance drivers: playing well; playing to win [1]. On inspection, one might view them as interchangeable, synonymous: a player driven to play well will win, anyone driven to win must play well.

Logical cracks surface, though, under light analysis: winning is a defined, unambiguous state; ‘playing well’ is non-absolute, typically subjective and often relative. Winning is an effect; decisions are causes.

The drivers, while correlated, are distinct - it is not a purely causal relationship. Winning is a state often achieved playing poorly; moreover, playing well doesn’t guarantee winning. Variance is the chief culprit, but so too are the subjective and absolute ways in which performance is typically measured. Ordinarily, playing your best game (the subjective) or even playing a ‘great game’ (the absolute) will be sub-standard versus top-class opposition. Conversely, contrasting performances might suffice against weak adversaries.

Additionally, winning, or being a winning player, like passing an exam, is qualitative: standard-of-play metrics, however, confer grading. Consequently, the win-driver relents, runs out of steam, somewhat, when attaining the win-state; the play-well driver, though, is the Duracell bunny.

Of course it is reckless to pay heed only to the playing-well driver: playing better than our opponents is pretty important too. Since the subjectivity inherent in measuring personal performance requires intermittent, at least, objective validation, we inevitably become sucked in by the winning-driver: it’s not profitable playing well against superior opponents. We must care about winning too [1].

The following hypothetical gambits contrast the mindsets.

A bursary of $2000 is awarded for attaining some goal over a six-hour session of $5-10 limit hold’em. In the first instance only a profitable session is required; in the second, the sum is awarded if the performance is deemed accomplished over the duration.

In the latter case the focus is clear: play great poker. Against the backdrop of a $2000 bonus, who cares if you win or lose? The object is to execute credit-earning decisions; perhaps a check-fold, bluff check-raise on the river, slow-playing AA pre-flop etc. Sure decision-making is tough, but there is only one agenda; it’s about as uncluttered as poker gets.

In the first scenario, right from the off, we sweat the balance sheet. Winning attracts a preservation-mindset, losing switches on hunt-mode: the former leads to conservative plays, the latter to lines of high variance. At times, rightly so: an increased chance of landing the 2k prize might easily compensate for expected losses suffered during a hand. Inevitably, though, added complexity and pressure lead to overly conservative/risky strategies – especially when time runs out on an uncertain outcome.
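To put rough numbers on that trade-off, here is a minimal sketch. Everything in it is hypothetical: the $40 deficit, the two candidate lines and their outcome probabilities are invented for illustration, and the session is assumed to end after this one hand.

```python
BONUS = 2000  # bursary paid only if the session finishes in profit

def total_value(session_profit, outcomes):
    """Expected value of a line on the final hand: chips plus bonus.

    outcomes is a list of (probability, chip_result) pairs for the line.
    """
    chip_ev = sum(p * x for p, x in outcomes)
    p_profit = sum(p for p, x in outcomes if session_profit + x > 0)
    return chip_ev + p_profit * BONUS

# Hypothetical spot: $40 down with one hand left to play.
session_profit = -40
fold = [(1.0, 0)]                     # chip EV 0, cannot finish ahead
gamble = [(0.30, 100), (0.70, -50)]   # chip EV -5, finishes ahead 30% of the time

fold_value = total_value(session_profit, fold)
gamble_value = total_value(session_profit, gamble)
print(fold_value, gamble_value)
```

On these made-up figures the line that loses chips on average is comfortably the better one once the bonus is counted, which is the 'rightly so' case; the trouble starts when the pressure pushes us towards such lines where the sums don't add up.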

But so what, we’re discussing a hypothetical, unrealistic, more complex and seemingly pointless problem. The scenarios, though, albeit hypothetically, inject reward, value into the decision-making process - it just so happens to be fiscal. Which, in theory, facilitates analysis: resulting, in the first case, in purported technical adjustments.

Unfortunately, added incentives are not fiscal. Nobel prize winner Daniel Kahneman and Amos Tversky reasoned people were, typically, loss-averse. They were not merely affirming the barefaced truth that the populace dislikes losing; rather, they inferred the loss-state bore an additional cost, beyond the loss itself - derived from countless examples of consistent economically irrational decision-making. Most of us, for example, experience a greater emotional differential between $50 win/lose states than between, say, gains of $50 and $150; unless the sums involved are critical in some way, this appears irrational.
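The $50 example can be sketched with a loss-averse value curve of the kind Kahneman and Tversky proposed. The shape of the curve and the two parameter values (0.88 and 2.25, figures often quoted from their later work) are assumptions for illustration only; nothing in the argument hangs on the exact numbers.

```python
def felt_value(x, alpha=0.88, loss_aversion=2.25):
    """Illustrative loss-averse value curve: concave for gains,
    steeper for losses. Parameter values are assumptions."""
    if x >= 0:
        return x ** alpha
    return -loss_aversion * (-x) ** alpha

# Two $100 swings: one straddles zero, the other sits wholly in gains.
win_vs_lose_50 = felt_value(50) - felt_value(-50)
gain_50_vs_150 = felt_value(150) - felt_value(50)
print(win_vs_lose_50, gain_50_vs_150)
```

With these parameters the swing across zero is felt roughly twice as keenly as the same $100 swing between two gains.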

Loss-states aren’t necessarily zero-sum: gain-states frequently occur in life without a complementary loss-state (& vice versa); in poker, few feel as replenished in victory as they do damaged by defeat [2]. Poker, especially tournament-poker, seductively portrays an image of fair competition; losing, therefore, inevitably baits underachieving sentiments - a natural, impulsive but somewhat imprudent response, since no two players sit the same exam. Another candidate meta-cost is the losing process; in poker, unlike most gambling, when you lose, for example, you lose to someone [3]. So we are averse to the tangible-loss, the loss-state, competitive-failure, the losing process.

We continuously credit and debit these and other mental accounts, though they trade in different currencies - if play-well is in credit, so what if the ego-account is in the red? That's fine, as long as we don't hold the exchange rates. Trade them off, though, as is done routinely in day-to-day multi-criteria decisions, and we compromise our play. Money is quite meaningless without emotion; a prospective purchase is seldom contemplated without assimilating the emotional pay-off. So unfortunately, the exchange-rates may already be in place, or could be sensibly extrapolated. That said, decisions in poker frequently surface under a cloud of emotion that is seldom present in day-to-day financially-oriented decisions.

[1] In this context ‘playing well’ represents a drive to execute good decisions, not to put up a ‘decent performance’. So the driver could easily be to ‘play perfectly’, ‘your A game’ etc

[2] That is w.r.t symmetric states (win/lose $500); rather than, naturally, in tournaments.

[3] which you’d suspect may in part, be offset by winning off someone, but, arguably, not equitably.

Friday, February 08, 2008

mindset (i)

Consider the following two attitudes:

to win;

to play well;

Sports’ coaches and their ilk would reckon to steer a truck between the pair. The tradition of fair play allied with putting up ‘a spirited fight’, supposedly endemic to the British psyche, is routinely cited as the bane of the nation’s sport.

‘Winning ugly’ was doubtless schooled into hard-knocks poker grads long before the intense Brad Gilbert coined the phrase. An object lesson in the above mindset-distinction should seldom be required in poker: few competitive environments exist where the glorious failure oxymoron is less self-evident. Nevertheless, some players – most of us to an extent – will opt to lose, or certainly profit less, than adopt a counter-machismo style.

In a sport such as tennis a winning strategy, in theory, is self-similar at all levels. In other words, in order to maximise the likelihood of winning a set, one must aspire to win every game; likewise, the requirement is on each point to win games. Naturally, meta-issues surface: injury or fatigue will occasionally necessitate the submissive relinquishing of a forlorn set, or game; shots deemed ‘low percentage’ will serve as loss leaders from time to time. Those issues aside, the game is, in structure, strategically self-similar. However, many sports witness combatants purposefully adopt localised losing-strategies in order to protect leads or chase victories (and so not follow self-similarity) [1].

In poker, the seductive strategies tend not to be self-similar: opting to maximise potential profit in a hand will not maximise the year’s potential profit [2]; minimising short-term risk, will not minimise long-term risk [3]; lines optimising the chances of winning individual pots, will rarely return the best chance of winning over a clutch of hands let alone a session, or year. Material meta-issues aside, a suitably bankrolled guy maximising EV from each pot, though, will maximise EV over the session, a year [4]. So what’s the problem with EV?
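The contrast between the two kinds of strategy can be sketched in a few lines. The three hands and their two lines apiece are hypothetical, and the hands are assumed independent (the 'material meta-issues aside' case), so a session's EV is simply the sum of the per-hand EVs.

```python
# Each hand offers two hypothetical lines: (win_probability, win_amount, loss_amount).
hands = [
    {"small_pot": (0.80, 10, 30), "big_pot": (0.40, 100, 40)},
    {"small_pot": (0.75, 20, 50), "big_pot": (0.35, 150, 60)},
    {"small_pot": (0.90, 5, 40), "big_pot": (0.50, 60, 30)},
]

def ev(line):
    """Expected profit of a single line."""
    p_win, win, loss = line
    return p_win * win - (1 - p_win) * loss

def session_ev(choose):
    """Sum of per-hand EVs when `choose` picks one line for every hand."""
    return sum(ev(choose(hand)) for hand in hands)

def max_ev_line(hand):
    return max(hand.values(), key=ev)

def max_win_prob_line(hand):
    return max(hand.values(), key=lambda line: line[0])

ev_by_max_ev = session_ev(max_ev_line)
ev_by_win_prob = session_ev(max_win_prob_line)
print(ev_by_max_ev, ev_by_win_prob)
```

Picking the highest-EV line hand by hand gives the highest session EV of any combination of lines; picking the line most likely to win each pot does not, because it ignores what is won and lost.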

Now, with the gymnast there’s no reward for deviation, no punishment for accuracy. There is determinism: better execution will reflect in better marks; improve one element - improve the whole (ceteris paribus). However, in such sports as golf, darts and snooker the penalisation of superior accuracy and/or technique, abounds. But not sufficient to confuse, mask their overall benefit [5]. Nevertheless, as a consequence, the random walks of these sporting-events witness wrong steps move in the right direction and vice versa.

As we move to more game-theoretic and high-impact low-probability event-driven sports, confusion and uncertainty often reign free [6]. As such, stakeholders (interested parties) seek frameworks within which to operate, rules to follow, conventions to reinforce, superstitions to suffer and so on. All in a bid, it seems, to bring order and certainty, and to mitigate regret, in particular, counterfactual regret. The ease with which one can visualise or imagine not losing in the manner lost will determine to a large extent the level of regret experienced in losing said way. As such, stakeholders, decision-makers, will anticipate contrasting levels of counterfactual regret with candidate options, and so attach to them varying levels of anticipatory regret (ex ante); this regret will doubtlessly filter (or attempt to) into the decision-making process.

Put simply, if doing that which would’ve won the game was something seldom done, there should be little cause for regret; conversely, if failing to do the very thing which is done routinely causes losing, well, you’ll be immersed in it. Naturally, this is (typically) known in advance of the decision being taken, or event occurring, and so the emotions are anticipated and so, inevitably, affect. It is therefore, arguably a strong mind for which a great number of decisions are in-play (Jose Mourinho, tactical substitutions after 20 minutes).

In this class of sport, any (interesting) two situations are seldom the same and similar ones will often be too infrequent to draw meaningful conclusions. As such, learning is difficult; a move towards a better strategy might not realise immediate or obvious benefit (expected or otherwise). If the strategy breaks convention and/or, upon failure, increases the level of counterfactual regret suffered by stakeholders, the expected gain could well be overshadowed by the personal risk now incumbent on the decision-maker.

Revered ex-England cricket captain, Mike Brearley, observed a marked difference in the recrimination imparted on a captain’s retrospectively poor toss-decision by the type of failed decision [7]. When the batting team fails in the first innings they are seen as disadvantaged but very much in the game: they could still win. However, when the batting side succeeds - the bowling team concedes many runs for but a few wickets at stumps on day 1 - the bowling side are deemed to be batted out of the game: they can hope for but a draw. So, it appears, to suffer a disadvantage having elected to bowl is more lamentable than a comparable fate after opting to bat.

This intuitive feel-good logic is attractive and rational, but hardly rigorously analysed. It is somewhat akin to such poker rationale as: ‘if you’re not in the game on day 2 of the WSOP, you can’t win: make sure you’re there - you got to be in it, to win it’. Nice sentiment, but optimal? No, not likely. Batting first effectively puts a cap on how bad things are at the close of play; but does not testify as to how good things are or, importantly, how things are likely, or expected, to be. So the cricket captain battles both convention and anticipatory regret [8].

A footballer’s (soccer) missed scoring opportunity is typically less regrettable when ‘working the keeper’ than on occasion of missing outright; so a strategy, or adjustment, boasting a marginal improvement in scoring, but material hike in misses, could, under the sufferance of criticism and/or self-recrimination, become distorted and feel less, not more, effective. Indeed, for some, it would be self-fulfilling, as confidence wanes under the blame-burden. Still, even with a clear mind, issues of personal-risk, aversion to criticism and blame might make maintaining the status quo the judicious choice.

Coaches often, or are inclined to, alter tactics, personnel if the game-state is undesirable. Once again there is a regret issue, to not change, is to not try (“do something!”); few upon failure, will grumble ‘you’ll never know what would have happened had you not made those changes’ [9]. Of course the game-dynamic is typically different and so change is legitimate. However, if, post-change, the team dominates possession, creates a multitude of chances, but fails in the end-goal, one might legitimately state winning to have been a ‘close-call counterfactual reality’: they nearly did it.

However, it is a leap to conclude the team benefited (in an expected sense or otherwise) since the performance improved post-intervention. Perhaps the decision was fortuitous, rather than astute; or indeed, the game-state suggested any change would trigger improvement [10]; in addition, games can and do progress organically, so the non-interventionist counterfactual reality may outperform the interventionist one. Nevertheless, informed decisions should correlate with favourable “factuals” (outcomes), as they should with positive close-call counterfactuals (nearly good outcomes), and, naturally, poor decisions map to negative close-call counterfactuals (& factuals). So the decision-maker, in theory, is better equipped, more advised, when cognisant of both: the factual and the close-call counterfactual.

In such environments marginal changes do not have marginal effects - where it matters. Which, as suggested, hinders learning; however, at least the football coach is availed discriminating metrics, beyond the score line, with which to measure the impact of change.

Instead consider the poker player pondering the big-bet call-down. There are no close-call counterfactuals here; it’s binary – miss or monster. No bluff-percentages accompany the shown hand, nothing to help realign a wayward strategy, save a potentially misleading showdown. Decisions typically feel a good deal less marginal ex-post than they do ex-ante (assuming hand disclosure); which rather worryingly suggests a somewhat selective or perhaps suppressed world-view, either before or after the fact.

So learning through EV is tough: decisions are rarely awarded the amounts ‘expected’ (the feedback (profit or loss), therefore, invariably distorts); meta-issues abound - the value, or cost radiates over a multitude of hands; scenarios exist without the edifying close call counterfactuals.
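How badly the feedback distorts can be gauged with a rough calculation. The sketch below assumes independent repetitions of the same decision and a normal approximation to the running total; the $2 edge and $60 standard deviation are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def chance_of_showing_a_loss(edge, spread, n):
    """Probability that a +EV decision shows a net loss after n
    independent repetitions (normal approximation to the total)."""
    return NormalDist(mu=edge * n, sigma=spread * sqrt(n)).cdf(0)

# Hypothetical decision: worth $2 on average, standard deviation $60.
for n in (10, 100, 1000, 10000):
    print(n, round(chance_of_showing_a_loss(2, 60, n), 3))
```

On these numbers a correct decision still has a sizeable chance of looking like a losing one after a hundred repetitions, and similar situations rarely recur that often.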

The idea is, of course, to learn over the long haul. Unfortunately, brain-accountancy is typically shoddy, the database unreliable and lacking in the necessary detail. What’s more, lessons are scattered, not learnt in chunks. Akin, perhaps, to hopping from one language to the next, after each new word, facet of grammar understood.

So, we budding players have a torrid time; mind you, not that I’d swap the monitor and mouse, for the high-bars and rings.

Next article: part ii

[1] Say, the very defensive golf-shot of the tournament leader at the 18th; the advancing goalkeeper of a trailing football (soccer) side.

[2] Note this is to survey the possible lines and select the option corresponding to the greatest attainable reward (eg raising, rather than a sensible call hoping, by chance, to be called by a weaker hand). While potential for a mother-of-all-runs could technically be viewed as self-similar, it is a somewhat facile point.

[4] Over-cautious strategies mitigating the risk of busting out of a game, may, through loss of EV, increase the chances of busting the bankroll.

[5] Although of course, there is often considerable uncertainty over which technique generates the best results: a golfer reinventing a swing. So there is technique and execution; which applies to poker too, at a high level.

[6] So for example, in football (soccer) a goal is a very infrequent event during a match, compared, to say, a pass, or a free-kick, but of course has very high impact since goals absolutely determine success or failure.

[7] Cricket, specifically, is a game played over up to 5 days, where a team must bowl out 10 batsmen from the opposition twice in order to win; if both teams achieve this then the team scoring the most runs wins; if neither is able to do so, the game is a draw. Therefore, it’s hard to envisage a side losing if they’ve scored a lot of runs for the loss of a few wickets (batsmen) during their first innings on day 1.

[8] They are clearly not independent, and doubtless reinforce each other.

[9] Except, when of course changes were instigated from a favourable, winning, position.

[10] A so-called secondary counterfactual – at least in a qualitative sense. Although the specifics of the improvement effected could only occur with that explicit change, it might be argued, a number of alternate alterations would have advanced matters uniquely, too. So improvement upon change was virtually inevitable.

Thursday, February 07, 2008

sklansky's theorem: fundamentally flawed (iv)

Though it is entirely proper to test, or disprove, any theorem, perhaps the most pertinent inquiry of Sklansky’s is to ask: is it useful?

The general interpretation or use of the theorem appears in the validation of decisions through hand playbacks with opponents’ cards face up: if performed the same way, the boy done good.

I’ve personally found the process to placate rather than inform. After bad-beats, I’ve consoled myself with: ‘well I’d have played the same if I’d known what he had’. Such thinking is quixotic, since, although the strategy, face-up, might be seen as perfect; face-down, it could be dire. As poker players, we seldom deal with certainties.

On occasions where a decision is marked retrospectively correct in this contrived manner, but on the (judgement-based) balance of probabilities at the time, deemed wrong, applying hindsight-analysis will likely mislead and so hinder learning.

A conflict, of sorts, arises when we witness a respected, more informed player deviate from our selected path or option. We are, instinctively, compelled to re-evaluate, coaxed, perhaps, into presuming we were in error and tempted to mimic (or migrate towards) the play [1]. However, this can resemble the above trap - his balance of probabilities, worldview (not to mention image) contrasts with our own; a superior player is found, typically, to be better informed, in the same sense, but to a lesser degree, as someone privy to hole cards: if our tools, skills, are different then so must be our models, and our answers [2].

Consider an amateur weatherman using an historical 3-day forecast algorithm [3]; light showers are predicted, at a low-confidence level, for the following morning. The amateur weatherman baulks at the prediction and forecasts a dry day. Later that evening, after submitting his prediction, he listens out for the Markov Amateur-Weathermen Society’s forecast, which utilises a sophisticated 10-day algorithm: a clear day is predicted – with a high level of confidence. The day was indeed free of rain. So the amateur's forecast was correct, but the decision? With better tools, more information, the weatherman will, perhaps, claim to have reached such a forecast anyway; however, contradicting his basic algorithm, whimsically, is clearly gambling against the odds, which, ultimately, is a strategy, destined to under-perform.
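The weatherman's gamble can be put as simple arithmetic. The sketch assumes the 3-day algorithm is calibrated and reads 'low confidence' as a 55% chance of showers; both are assumptions made purely for illustration.

```python
def hit_rate(p_rain, forecast_rain):
    """Long-run share of correct forecasts when it truly rains
    with probability p_rain on days like this one."""
    return p_rain if forecast_rain else 1 - p_rain

# Assumption: 'low confidence' showers means the algorithm puts rain at 55%.
p_rain = 0.55
follow_algorithm = hit_rate(p_rain, forecast_rain=True)
contradict_algorithm = hit_rate(p_rain, forecast_rain=False)
print(follow_algorithm, contradict_algorithm)
```

Going against the only model he has leaves him right less often than following it, however this particular morning turned out.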

Of course, inevitably, as we develop, the cart will, on occasion, precede the horse - we are seduced into reproducing the effects (plays) of better models, so our model resembles it, appears similar - to kid ourselves we’re improving; nevertheless, in the poker world, one might argue extrapolating this way, persisting in (based on the current model) erroneous decisions, may render a speedier, albeit costlier, progression - at least for some.

To conclude, it is, surely, advantageous to measure our decisions based on our interpretation of the information gathered, not on what wasn’t, or couldn’t be. The value in ascertaining the correctness of a decision based on a near utopian-level of awareness is far from apparent.

[1] Which of course can edify (& so develop the model), but also regress if applied without the necessary insight. E.g. value-betting weak hands.

[2] Obviously, not on every occasion.

[3] i.e. it uses the previous 3-days weather to predict the next.