Monday, July 21, 2008

the skill factor

This is a slightly modified article written a few years back. It’ll be the last one for the foreseeable future.

Which version of poker requires the greater skill? We’re all biased one way or another; prone, perhaps, to vouch for the version we know, or succeed at – whether because we see greater depth in it or are driven to service our ego. For others it will be the converse: the foreign game seems more complex, leaving us unsure underfoot. This article will, at best, only vaguely inform on what is a rather ambiguous and somewhat intractable question.

Poker, like most games, is a combination of both luck and skill. It seems entirely rational to call a game more dependent on luck the less skilful game. Classifications of, say, ‘80% luck, 20% skill’ are tempting and arguably provide some insight; however, it is not meaningful to qualify a game’s skill level this way. Our personal ranking of skilful games doesn’t, and shouldn’t, faithfully follow some simplistic percentage-luck metric: compare snap and tic-tac-toe to poker, backgammon and bridge. Are the skilful games more or less reliant on luck? Skill in a game has both qualitative and quantitative attributes.

Those who’ve undertaken some basic measure of computer programming will likely recall being tasked to code a sorting algorithm. A common poser is to organise a group of numbers into numerical order, or, in the more taxing version, to sort a list of words alphabetically. Each pass of the program creates a more ordered list [1], until the task is completed – within a determinable maximum number of loops.
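
For concreteness – a generic sketch, not something from the original piece – a bubble sort has exactly this property: each pass can only reduce the disorder, and a list of n items is guaranteed sorted within n − 1 passes.

```python
def bubble_sort(items):
    """Each pass bubbles the largest remaining value into place, so the list
    is never less ordered after a pass, and at most len(items) - 1 passes
    are ever required."""
    n = len(items)
    for _ in range(n - 1):              # the determinable maximum number of loops
        swapped = False
        for i in range(n - 1):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
        if not swapped:                 # already sorted: stop early
            break
    return items

print(bubble_sort([31, 4, 15, 9, 26, 5]))   # [4, 5, 9, 15, 26, 31]
```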

One might argue a poker tournament serves as one cycle of a sorting algorithm for players, albeit a somewhat stochastic one: the list might easily become less ordered after one tournament loop. Still, you’d figure that, given enough tournaments, 100 fixed-skill players would eventually arrive at a perfect ordering of ability. Unlike the word and number sorting algorithms, though, there is no upper bound on the number of loops guaranteeing a completed task. In the real world of poker, moreover, the very things (skill levels) set out to be ordered in the first tournament no longer exist at the start of the second: abilities change.
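
A toy sketch of the stochastic version (all numbers invented, and mine rather than the article’s): each ‘tournament’ is a single bubble-sort-style pass, except the more skilled of two adjacent players only wins the encounter with some probability p < 1, so a pass can leave the field less ordered than before and no fixed number of passes guarantees a perfect ranking.

```python
import random

def noisy_pass(field, p=0.6):
    """One 'tournament': adjacent players meet; the more skilled of the pair
    only wins the encounter with probability p (an invented parameter), and
    the winner takes the forward position."""
    for i in range(len(field) - 1):
        stronger = max(field[i], field[i + 1])
        weaker = min(field[i], field[i + 1])
        winner = stronger if random.random() < p else weaker
        loser = weaker if winner == stronger else stronger
        field[i], field[i + 1] = winner, loser

def disorder(field):
    """Adjacent pairs out of order; 0 means a perfect ranking (best first)."""
    return sum(field[i] < field[i + 1] for i in range(len(field) - 1))

skills = list(range(100, 0, -1))        # 100 fixed-skill players
random.shuffle(skills)
for tournament in range(1, 1001):
    noisy_pass(skills)
    if disorder(skills) == 0:
        print("perfectly ordered after", tournament, "tournaments")
        break
else:
    print("still imperfect after 1000 tournaments; disorder =", disorder(skills))
```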

It seems apparent that, in tournament poker, Limit Hold’em (LH) orders better than No-Limit (NL): skilful limit players should, like for like, expect to outperform their NL counterparts. Still, even were such an assertion shown to hold true, it in no way testifies to a more skilful game, or more skilled players. What it would evidence is limit hold’em discriminating better between players than NL. Thus one would anticipate the limit version of hold’em establishing rank more readily than NL in our iteration of poker tournaments.

Perhaps one way of modelling Limit Hold’em would be to identify it as a series of multiple-choice questions; NL might be approximated similarly, of course with a greater number of choices per question.

Now, it is without modesty that I can currently claim to hold both a better vocabulary and greater mathematical insight than my 7-year-old niece. However, it is clearly trivial to set 50 multiple-choice mathematical questions, or tests of vocabulary, at which I could fail to hazard any sort of meaningful guess. Such questions might tax the knowledgeable, or brilliant, but they’d fail to discriminate between our respective abilities, or knowledge.

A continual lowering of the standard will eventually yield the odd problem I’d reckon on tackling, or at least guessing at educatedly. Yet in spite of this edge, I could easily lose, since in only a handful of instances are my responses an improvement on outright guesses. As the quality is relaxed still further, a point should eventually arrive where I’d anticipate confidently answering all 50 questions, and where she is still forced to guess at all of them. At this level the test discriminates very well (but measures poorly) between our respective knowledge of the subject.

Continued simplification will witness a juncture where we are both able to answer all 50 questions: once again the test fails to discriminate. Clearly, the level of difficulty most likely to discriminate between candidates isn’t the one offering the superior challenge or requiring the greatest degree of skill or understanding; nor, naturally, is it the simplest one.

Bidding to analogise more meaningfully with the NL/Limit hold’em tournament debate, we might choose to spice up the multiple-choice test. Suppose, in a discriminating multiple-choice test, we implement a scoring system tolerating 3 mistakes; the score awarded is the running score when the third error occurs. Then run it with, say, 6 strikes, or one strike: which of the tests is likely to reveal a more representative order of ability?

While the questions are the same in both cases, the penalty for failure isn’t. One could easily design a test that would both discriminate and challenge more than another on merit alone, but if the penalty system employed were stern enough, it’d be expected to do less well at sorting the participants in order of ability: higher variance. It certainly appears NL hold’em has a higher penalty system for poor decision-making*: the limit version yields more lives. That said, employing a heavy penalty system could serve as a better discriminator too. In life, for example, one person may get more day-to-day decisions right than another, but have a tendency to get the important ones wrong. Which of them will be the happier? And if a set of tennis were settled by the first unforced error from either player, what odds for Nadal at the French Open?
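
A rough simulation of the thought experiment (the abilities, 0.8 and 0.6, and the trial counts are invented): two candidates answer the same style of 50-question paper, scoring stops at the k-th error, and we ask how often the stronger candidate actually finishes in front under 1, 3 and 6 strikes.

```python
import random

def strike_test(p_correct, strikes, n_questions=50):
    """Work through the paper; return the running score at the moment the
    strike limit is reached (or the final score if it never is)."""
    score = errors = 0
    for _ in range(n_questions):
        if random.random() < p_correct:
            score += 1
        else:
            errors += 1
            if errors == strikes:
                break
    return score

def discrimination(p_strong=0.8, p_weak=0.6, strikes=3, trials=20_000):
    """Proportion of trials in which the stronger candidate outscores the weaker."""
    wins = sum(strike_test(p_strong, strikes) > strike_test(p_weak, strikes)
               for _ in range(trials))
    return wins / trials

for strikes in (1, 3, 6):
    print(strikes, "strikes:", round(discrimination(strikes=strikes), 3))
```

On these made-up abilities the one-strike paper produces the noisiest ordering and the six-strike paper the most faithful one: more lives, less variance – which is the point about discrimination, not a claim about which test is harder.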

The weighting or importance assigned to different decisions or outcomes is crucial when trying to ascertain an individual's effectiveness, ability or, indeed, to establish a proper order of things. Calling down a possible bluff on the river in NL often holds a far greater penalty for failure than such a call in Limit-Hold’em. It might appear that the ‘weights’ are a little kinder to the limit player.

So, in conclusion: the skill factor in a game is measured in terms of quality as well as quantity; a game that more readily discriminates between players isn’t necessarily more challenging or more skilful; heavier penalties for poor decision-making can level the playing field; and it is not paradoxical to state that one game is both more luck-dependent and more skill-dependent than another.


*Note, by suggesting a game has a heavier penalty system one is not implying it is more reliant on luck. A game can be deterministic, free from luck and uncertainty, yet still employ a penalty system. One might consider the luck aspect of a game to be the extent to which poor decisions can be rewarded and, consequently, good ones penalised, or indeed how much of the game is free from decision-making altogether.

Monday, May 05, 2008

dreaded delays

A couple of years ago I ran across the following article in a chance purchase of a scientific magazine:

Dread lights up like a pain in your brain

The first trial, apparently, is designed to verify the presence of dread and infer a relationship with time. Subjects are offered a choice between a quantity of physical pain with a short delay, and the same quantity of pain with a longer delay.

p1 + d1 < p2 + d2, where p1 = p2 and d1 < d2


Nurturing a basic assumption on dread, it’s a no-brainer: the former dominates the latter – except for the 16% with unorthodox value systems. The second trial, more interestingly, sets out to determine whether dread and pain are tradeable currencies. The subjects choose between:

p1 + d1 vs. p2 + d2, where p1 > p2 and d1 < d2.

Neither option dominates, so subjects are now forced, non-trivially, to consult their value systems [2] and trade off the delay and shock differences expressed by the two options; less pleasantly, but comparably, to how one might negotiate choosing between a car of superior drivability but inferior fuel efficiency and another.

The second trial might elicit, or bias, perceived rational choice, rather than value or preference: it appears irrational to opt for additional pain at the expense of waiting. Dread is a state of mind – it’s going to happen, why worry about it: dread is internal. Physical pain, by contrast, is real, validated somewhat externally (by the electrodes!), and thus a more plausible cost. So perhaps some of the subjects are predisposed to say no to an increase in pain rather than attempt to trade off two very different, and so difficult to compare, metrics. Though, arguably, such rationale advances more readily on the philosophical thought-experiment plane than when wired up experiencing both forms of anxiety.

Asymmetry: if waiting for pain is painful, shouldn’t waiting for pleasure be comparably pleasurable? Nature is cruel; still what would our ancestors have got done?

Hyperbolic discounting is the term ascribed by economists to the preference for low, early rewards over higher, late ones: the question of why is conferred on psychologists, one assumes. One might conjecture it is a consequence of our mostly innate inability to delay gratification; or an intrinsic (related) tendency to discount value to our future selves.
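
The usual formalisation (standard in the literature, not spelled out in the post): an exponential discounter values a reward A at delay D as A·δ^D, while a hyperbolic discounter values it as A/(1 + kD) – steep near the present, shallow far from it. The amounts and the k below are invented, purely to show the characteristic preference reversal.

```python
def hyperbolic(amount, delay, k=1.0):
    """Mazur-style hyperbolic discounting: V = A / (1 + k*D)."""
    return amount / (1 + k * delay)

def exponential(amount, delay, delta=0.9):
    """Exponential discounting: V = A * delta**D. Pushing both rewards back by
    the same delay rescales both values equally, so it never reverses a preference."""
    return amount * delta ** delay

for shift in (0, 10):   # 0: rewards imminent; 10: both pushed ten periods away
    small_soon = hyperbolic(50, 0 + shift)
    large_late = hyperbolic(100, 2 + shift)
    print(f"shift {shift}: 50 soon -> {small_soon:.1f}, 100 later -> {large_late:.1f}")
# shift 0:  50.0 vs 33.3 -> take the 50 now
# shift 10:  4.5 vs  7.7 -> take the 100 later: the preference reverses
```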

A friend winning £5000, one hopes, will evoke cheer in us; mirror neurons fire up and we share, in part, the experience, if not the cash. Of course our responses are seldom so clean; still, a preference for the pal’s windfall over a personal gain of, say, £50 should fail to surface in but a few. Exchange the £50 for £1000, though: what then?

The altruistic gene and the extent of the friendship will, of course, influence the almost inevitably discounted value we place on someone else’s gain over an equivalent personal one. Arguably a similar sense of discounted empathy is present when contemplating rewarding our future selves. Confirmation, or knowledge, of future rewards doesn’t benefit the present as it does the future-present. The nearer the reward, typically, the greater the anticipation, the better the ‘now-experience’ – the less we discount it. As with the linked example, a similar present-satisfaction level is predicted, now, at the prospect of equivalent fixed rewards in either 5 or 6 years’ time. Thus they’re comparably discounted, valued similarly. Inevitably, therefore, doubling the latter date’s reward, as in the linked example, trivialises the choice. In other words, the sooner the reward, the more ‘me’ is benefiting; the further away, the more it is someone else’s gain: the future-me. Somewhat off-track; and besides, there are other discounting drivers, such as circumstance and risk.

Waiting for pain seems unequivocally bad; pleasure delays appear somewhat nebulous – particularly when pleasure banishes a negative state. However, it seems, in general, we choose instant gratification – which often won’t bind to value, or arguably preference. Rewarding or not, delaying pleasure holds less impact than dread-induced pain-delay.

The poker tie-in is an obvious one. Players predisposed to inflict protracted delays on their adversaries before fixing on a decision often qualify their actions as geared to elicit information, or to guarantee a measured decision; often, though, the design is solely to factor dread into a competitor’s future decision-making. However, waiting on an opponent’s play is in theory a mixture of both states: anticipatory pleasure and anticipatory pain. Uncertainty, though – a cost not mentioned so far – will typically mitigate pleasure and augment pain. For example, in the above shock experiment one could easily imagine a shock ‘sometime in the next minute’ to be less preferable than one at precisely 30 seconds. Indeed, one might expect uncertainty to be traded out for shock increases, as delays were.

The game or hand-specifics will, of course, determine the mix of anticipatory emotions. Anticipatory pleasure appears drowned by anticipatory dread (& ~regret) when enduring the uncertainty accompanying a big bluff. It makes sense: the bluff-state is typically net-negative, emotionally, since to not bluff is to suffer losing. As such, bluffing is often designed to reach a less (expected) net-negative state, rather than to move to a positive one – so it should seldom feel great. In reality, of course, the effect of our bluffing on our emotions is not so calculated. We are often inclined to bluff for the wrong reasons, notably loss aversion, which in fact leads to a poorer (expected) state. Anticipatory pleasure should perhaps surface, or even dominate, in free-roll situations or when a pot is all but wrapped up. Unfortunately, it rarely feels so tangible: it seems we bank the gain, and unfulfilled gains result in disappointment. The lack of emotional symmetry between gain/loss states (biasing anticipatory dread over pleasure), allied to the discomfort of uncertainty, testifies to a tough time waiting.

If one accepts a player is damaged through dread – damaged not just by the event itself but by its anticipation – then ethical questions of a sort are bound to arise. The claim that to bear, tolerate, endure, mask, even mutate such suffering is a requisite part of a player’s armour (as to inflict it is of the arsenal) is legitimate. However, inflicting it is trivial, and often cost-free. In live play it is mostly unregulated; players are generally only restricted within the hand. While some target sufferance selectively, others initiate a catch-all strategy: sweating a guy out at every opportunity guarantees sufferance in any weak position. Admittedly, those allocating delays selectively will inflict more angst, since the threat is weightier; still, the distorted perceptions under such circumstances doubtless lead one to conclude, and experience, the catch-all play as eliciting the greater net anticipatory dread. Anticipatory dread, of course, is the design of this, at times, tedium.

In blackjack, circumstances frequently surface where emotions bait the knowledgeable player into falling out of line; however, the prevailing sense of ‘but I know this is the right decision’ will usually suffice for all but the most marginal of decisions. Hence placating emotions via a compromised strategy is seldom facilitated: the two remain partitioned [1]. In poker, such defences are easily breached. Situations are unique, measurements subjective, and hands can be played in a mixture of ways. Consequently, without a definite rebuttal to hand, emotions gain easy access to decision-making, and, in seeking out plausible strategies to reduce or minimise the emotional cost, they corrupt the decision-making process. Anticipatory dread is just one, rather powerful, emotion the mind is eager to dump any way it can.

Once again, though this is rightly viewed as a specific instance of the far greater challenge of managing our emotions in poker, the test, while appropriate, should be even-handed. For purposes of practicality, fairness and skill, the resource should be restricted over an interval, with individuals left to consider how best to allocate it (as it is on-line).

-----------------------------------------------------------------------------------------------------------------

On-line poker rooms should be fully tuned in to the cost, or value of inserting time delays.

Life for the multi-tabler can be exhausting; keeping track of all stacks at any given time is extremely tough: it is easy to be a pot or two out on an estimate on any one table. Years ago, while playing regularly at both Party and Stars, I observed a marked difference between the sites’ respective feel-good factors. No, not the softness differential, but rather the emotional reflex upon winning a pot. Glancing post-hand on Party I’d be gratified by seeing the cash total update; on Stars it would already have updated. Naturally, I, the Pavlovian dog, half-salivated when the bell rang: partly expecting the stack size to remain unchanged, partly expecting it to increment. Of course, I knew which site was which, but not in time to catch the reflex emotion. Needless to say, I’d experience a fleeting burst of pleasure or disappointment depending on when the roll was updated. Significantly, no flipside or downside exists to the inserted delay – neither site deducted losses at the end of the hand, so there was no complementary, deflating ‘delayed loss’ at Party.

In the early years, PokerStars’ reputation for tournaments and big-bet cash games soared, as did their notoriety for killer rivers: the luckless rapidly dubbed the site RiverStars.

Defeat snatched from the jaws of victory is the bitterest pill: virulent when administered by no-limit poker. Death in limit comes by a thousand cuts; a mere handful of meaty blows will suffice to slay the NL victim. As such, those beats persist in the memory, not recollected as some hazy nightmare, as is often the case with disastrous limit sessions.

In a bid, it seems, to retain the authenticity and drama of live poker, Stars, when no more action could take place in a hand, turned the cards on their backs. In addition, they injected a palpable, dramatic delay between each chapter – every street. In limit cash games, at the time the mainstay of other sites, this wouldn’t occur: players were seldom all-in, and when they were, hands were (certainly now) only revealed, or mucked, once the river was out and the hand concluded. These delays augment the torments of counterfactual regret and anticipatory dread: the emotional investment implicit in waiting, knowing what needs to be avoided for just that card, is considerable; defeat, therefore, is all the more crushing. With cards face-down, the intensity and dread of the river is generally less – seldom are miracle rivers apparent in ignorance of an adversary’s holding: your aces are vulnerable on every street.

When cards are dealt in one swoop, however, there is little time for anticipation; the beginning, middle and end are fuzzily defined. As such, an abrupt defeat engenders a less vivid counterfactual reality of winning – all hurdles to be vaulted at once – in contrast, say, to one street and four outs standing before victory. In addition, of course, there is the aforementioned anticipatory dread: lengthier delays increase dread. The purported lack of symmetry between the experience of delays before favourable and unfavourable events, allied to a general emotional aversion to uncertainty, suggests such injections are unwise: an unnecessary accretion of negative experiences hardly adds value to the brand.

So, in short, it seemed PokerStars implicitly ensured its rivers scarred the most.

Next article

[1] Incidentally, delays from the croupier in blackjack frustrate the majority of punters; those accompanying losses seem to torment the most.

Monday, February 11, 2008

mindset (ii)

London buses indeed.

Consider once again the performance drivers: playing well; playing to win [1]. On inspection, one might view them as interchangeable, synonymous: a player driven to play well will win; anyone driven to win must play well.

Logical cracks surface, though, under light analysis: winning is a defined, unambiguous state; ‘playing well’ is non-absolute, typically subjective and often relative. Winning is an effect; decisions are causes.

The drivers, while correlated, are distinct – the relationship is not purely causal. Winning is a state often achieved playing poorly; moreover, playing well doesn’t guarantee winning. Variance is the chief culprit, but so too are the subjective and absolute ways performance is typically measured. Ordinarily, playing your best game (the subjective) or even playing a ‘great game’ (the absolute) will be sub-standard versus top-class opposition. Conversely, contrasting, even antithetical, performances might suffice against weak adversaries.

Additionally, winning, or being a winning player, is, like passing an exam, qualitative: standard-of-play metrics, however, confer grading. Consequently, the win-driver relents, runs out of steam somewhat, upon attaining the win-state; the play-well driver, though, is the Duracell bunny.

Of course it is reckless to pay heed only to the playing-well driver: playing better than our opponents is pretty important too. Since the subjectivity inherent in measuring personal performance requires at least intermittent objective validation, we inevitably become sucked in by the winning-driver: it’s not profitable playing well against superior opponents. We must care about winning too [1].

The following hypothetical gambits contrast the mindsets.

A bursary of $2000 is awarded for attaining some goal over a six-hour session of $5-10 limit hold’em. In the first instance only a profitable session is required; in the second, the sum is awarded if the performance is deemed accomplished over the duration.

In the latter case the focus is clear: play great poker. Against the backdrop of a $2000 bonus, who cares if you win or lose? The object is to execute credit-earning decisions; perhaps a check-fold, a bluff check-raise on the river, slow-playing AA pre-flop, etc. Sure, decision-making is tough, but there is only one agenda; it’s about as uncluttered as poker gets.

In the first scenario, right from the off, we sweat the balance sheet. Winning attracts a preservation mindset, losing switches on hunt-mode: the former leads to conservative plays, the latter to lines of high variance. At times, rightly so: an increased chance of landing the 2k prize might easily compensate for expected losses suffered during a hand. Inevitably, though, the added complexity and pressure leads to overly conservative or overly risky strategies – especially when time runs out on an uncertain outcome.

But so what? We’re discussing a hypothetical, unrealistic, more complex and seemingly pointless problem. The scenarios, though, albeit hypothetically, inject reward – value – into the decision-making process; it just so happens to be fiscal. Which, in theory, facilitates analysis, leading, in the first case, to purported technical adjustments.

Unfortunately, the incentives added in practice are not fiscal. Nobel laureate Daniel Kahneman and Amos Tversky reasoned that people are, typically, loss-averse. They were not merely affirming the barefaced truth that the populace dislikes losing; rather, they inferred that the loss-state bears an additional cost beyond the loss itself – derived from countless examples of consistently economically irrational decision-making. Most of us, for example, experience a greater emotional differential between $50 win/lose states than between, say, gains of $50 and $150; unless the sums involved are critical in some way, this appears irrational.
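
Their value function makes the $50 example concrete. A sketch using the textbook parameter estimates (curvature α ≈ 0.88, loss-aversion coefficient λ ≈ 2.25, from Tversky and Kahneman’s 1992 estimates); the dollar figures are simply the ones in the paragraph above.

```python
def pt_value(x, alpha=0.88, lam=2.25):
    """Prospect-theory value function: concave for gains, convex and steeper
    (by the loss-aversion factor lam) for losses, measured from a reference point."""
    return x ** alpha if x >= 0 else -lam * (-x) ** alpha

gap_win_lose  = pt_value(50) - pt_value(-50)    # +$50 versus -$50
gap_gain_gain = pt_value(150) - pt_value(50)    # +$150 versus +$50
print(round(gap_win_lose), round(gap_gain_gain))   # roughly 102 vs 51
```

Both pairs are $100 apart, yet the felt gap across the win/lose boundary comes out roughly twice the size of the gap between the two gains – the asymmetry the paragraph describes.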

Loss-states aren’t necessarily zero-sum: gain-states frequently occur in life without a complementary loss-state (& vice versa); in poker, few feel as replenished in victory as they do damaged by defeat [2]. Poker, especially tournament poker, seductively portrays an image of fair competition; losing, therefore, inevitably baits underachieving sentiments – a natural, impulsive but somewhat imprudent response, since no two players sit the same exam. Another candidate meta-cost is the losing process; in poker, unlike most gambling, when you lose, you lose to someone [3]. So we are averse to the tangible loss, the loss-state, competitive failure, and the losing process.

We continuously credit and debit these and other mental accounts, though they trade in different currencies – if play-well is in credit, so what if the ego-account is in the red? That’s fine, as long as we don’t hold the exchange rates. Trade them off, though, as is done routinely in day-to-day multi-criteria decisions, and we compromise our play. Money is quite meaningless without emotion; a prospective purchase is seldom contemplated without assimilating the emotional pay-off. So, unfortunately, the exchange rates may already be in place, or could be sensibly extrapolated. That said, decisions in poker frequently surface under a cloud of emotion seldom present in day-to-day financially oriented decisions.

Next article

[1] In this context ‘playing well’ represents a drive to execute good decisions, not to put up a ‘decent performance’. So the driver could easily be to ‘play perfectly’, ‘your A game’ etc

[2] That is w.r.t symmetric states (win/lose $500); rather than, naturally, in tournaments.

[3] Which, you’d suspect, may in part be offset by winning off someone, but, arguably, not equitably.

Friday, February 08, 2008

mindset (i)

Consider the following two attitudes:

to win;

to play well;

Sports coaches and their ilk would reckon to steer a truck between the pair. The tradition of fair play allied with putting up ‘a spirited fight’, supposedly endemic to the British psyche, is routinely cited as the bane of the nation’s sport.

‘Winning ugly’ was doubtless schooled into hard-knocks poker grads long before the intense Brad Gilbert coined the phrase. An object lesson in the above mindset-distinction should seldom be required in poker: few competitive environments exist where the glorious-failure oxymoron is more self-evident. Nevertheless, some players – most of us to an extent – will opt to lose, or certainly profit less, rather than adopt a counter-machismo style.

In a sport such as tennis a winning strategy is, in theory, self-similar at all levels. In other words, in order to maximise the likelihood of winning a set, one must aspire to win every game; likewise, to win games, the requirement falls on every point. Naturally, meta-issues surface: injury or fatigue will occasionally necessitate the submissive relinquishing of a forlorn set, or game; shots deemed ‘low percentage’ will serve as loss leaders from time to time. Those issues aside, the game is, in structure, strategically self-similar. However, many sports witness combatants purposefully adopt localised losing strategies in order to protect leads or chase victories (and so not follow self-similarity) [1].

In poker, the seductive strategies tend not to be self-similar: opting to maximise potential profit in a hand will not maximise the year’s potential profit [2]; minimising short-term risk will not minimise long-term risk [3]; lines optimising the chances of winning individual pots will rarely return the best chance of winning over a clutch of hands, let alone a session, or a year. Material meta-issues aside, though, a suitably bankrolled player maximising EV from each pot will maximise EV over the session, or a year [4]. So what’s the problem with EV?

Now, with the gymnast there’s no reward for deviation, no punishment for accuracy. There is determinism: better execution will reflect in better marks; improve one element – improve the whole (ceteris paribus). However, in sports such as golf, darts and snooker, the penalisation of superior accuracy and/or technique abounds – but not sufficiently to confuse or mask their overall benefit [5]. Nevertheless, as a consequence, the random walks of these sporting events witness wrong steps move in the right direction and vice versa.

As we move to more game-theoretic sports driven by high-impact, low-probability events, confusion and uncertainty often run free [6]. As such, stakeholders (interested parties) seek frameworks within which to operate, rules to follow, conventions to reinforce, superstitions to suffer and so on. All in a bid, it seems, to bring order and certainty, and to mitigate regret – in particular, counterfactual regret. The ease with which one can visualise or imagine not losing in the manner lost will determine to a large extent the level of regret experienced in losing that way. As such, stakeholders and decision-makers will anticipate contrasting levels of counterfactual regret with candidate options, and so attach to them varying levels of anticipatory regret (ex ante); this regret will doubtless filter (or attempt to) into the decision-making process.

Put simply, if that which would have won the game was something seldom done, there should be little cause for regret; conversely, if failing to do the very thing that is done routinely causes the loss, well, you’ll be immersed in it. Naturally, this is (typically) known in advance of the decision being taken, or the event occurring, and so the emotions are anticipated and so, inevitably, affect the decision. It therefore arguably takes a strong mind to keep a great number of decisions in play (Jose Mourinho, tactical substitutions after 20 minutes).

In this class of sport, any two (interesting) situations are seldom the same, and similar ones will often be too infrequent to draw meaningful conclusions from. As such, learning is difficult; a move towards a better strategy might not realise immediate or obvious benefit (expected or otherwise). If the strategy breaks convention, and, or upon failure, increases the level of counterfactual regret suffered by stakeholders, the expected gain could well be overshadowed by the personal risk now incumbent on the decision-maker.

Revered ex-England cricket captain Mike Brearley observed a marked difference in the recrimination heaped on a captain’s retrospectively poor toss decision, depending on the type of failed decision [7]. When the team batting first fails in its first innings, it is seen as disadvantaged but very much in the game: it could still win. However, when the batting side succeeds – the bowling team concedes many runs for only a few wickets by stumps on day 1 – the bowling side is deemed to have been batted out of the game: it can hope for no more than a draw. So, it appears, to suffer a disadvantage having elected to bowl is more lamentable than a comparable fate after opting to bat.

This intuitive feel-good logic is attractive and rational, but hardly rigorously analysed. It is somewhat akin to such poker rationale as: ‘if you’re not in the game on day 2 of the WSOP, you can’t win: make sure you’re there – you’ve got to be in it to win it’. Nice sentiment, but optimal? No, not likely. Batting first effectively puts a cap on how bad things are at the close of play; but it does not testify as to how good things are or, importantly, how things are likely, or expected, to be. So the cricket captain battles both convention and anticipatory regret [8].

A footballer’s (soccer) missed scoring opportunity is typically less regrettable when ‘working the keeper’ than when missing outright; so a strategy, or adjustment, boasting a marginal improvement in scoring but a material hike in misses could, under the sufferance of criticism and/or self-recrimination, become distorted and feel less, not more, effective. Indeed, for some, it would be self-fulfilling, as confidence wanes under the blame-burden. Still, even with a clear mind, issues of personal risk and aversion to criticism and blame might deem maintaining the status quo the judicious choice.

Coaches are often inclined to alter tactics or personnel if the game-state is undesirable. Once again there is a regret issue: to not change is to not try (“do something!”); few, upon failure, will grumble ‘you’ll never know what would have happened had you not made those changes’ [9]. Of course the game-dynamic is typically different and so change is legitimate. However, if, post-change, the team dominates possession and creates a multitude of chances but fails in the end-goal, one might legitimately state winning to have been a ‘close-call counterfactual reality’: they nearly did it.

However, it is a leap to conclude the team benefited (in an expected sense or otherwise) simply because the performance improved post-intervention. Perhaps the decision was fortuitous rather than astute; or indeed the game-state suggested any change would trigger improvement [10]; in addition, games can and do progress organically, so the non-interventionist counterfactual reality may outperform the interventionist one. Nevertheless, informed decisions should correlate with favourable “factuals” (outcomes), as they should with positive close-call counterfactuals (nearly good outcomes); and, naturally, poor decisions map to negative close-call counterfactuals (& factuals). So the decision-maker, in theory, is better equipped, better advised, when cognisant of both the factual and the close-call counterfactual.

In such environments marginal changes do not have marginal effects – where it matters. Which, as suggested, hinders learning; however, at least the football coach is availed of discriminating metrics, beyond the scoreline, with which to measure the impact of change.

Instead consider the poker player pondering the big-bet call-down. There are no close-call counterfactuals here; it’s binary – miss or monster. No bluff percentages accompany the shown hand, nothing to help realign a wayward strategy, save a potentially misleading showdown. Decisions typically feel a good deal less marginal ex post than they do ex ante (assuming hand disclosure); which rather worryingly suggests a somewhat selective, or perhaps suppressed, world-view, either before or after the fact.

So learning through EV is tough: decisions are rarely awarded the amounts ‘expected’ (the feedback – profit or loss – therefore invariably distorts); meta-issues abound – the value, or cost, radiates over a multitude of hands; and scenarios exist without the edifying close-call counterfactuals.

The idea, of course, is to learn over the long haul. Unfortunately, brain-accountancy is typically shoddy, the database unreliable and lacking the necessary detail. What’s more, lessons are scattered, not learnt in chunks. Akin, perhaps, to hopping from one language to the next after each new word or facet of grammar understood.

So, we budding players have a torrid time; mind you, not that I’d swap the monitor and mouse, for the high-bars and rings.

Next article: part ii

[1] Say, the very defensive golf-shot of the tournament leader at the 18th; the advancing goalkeeper of a trailing football (soccer) side.

[2] Note this is to survey the possible lines and select the option corresponding to the greatest attainable reward (e.g. raising, rather than a sensible call hoping, by chance, to be called by a weaker hand). While the potential for a mother-of-all-runs could technically be viewed as self-similar, it is a somewhat facile point.

[4] Over-cautious strategies mitigating the risk of busting out of a game, may, through loss of EV, increase the chances of busting the bankroll.

[5] Although of course, there is often considerable uncertainty over which technique generates the best results: a golfer reinventing a swing. So there is technique and execution; which applies to poker too, at a high level.

[6] So for example, in football (soccer) a goal is a very infrequent event during a match, compared, to say, a pass, or a free-kick, but of course has very high impact since goals absolutely determine success or failure.

[7] Cricket, specifically, is a game played over up to 5 days, where a team must bowl out 10 batsmen from the opposition twice in order to win; if both teams achieve this then the team scoring the highest runs wins; if neither are able to do so, the game is a draw. Therefore, it’s hard to envisage a side losing if they’ve scored a lot of runs for the loss of a few wickets (batsmen) during their first innings on day 1.

[8] They are clearly not independent, and doubtless reinforce each other.

[9] Except, when of course changes were instigated from a favourable, winning, position.

[10] A so-called secondary counterfactual – at least in a qualitative sense. Although the specifics of the improvement effected could only occur with that explicit change, it might be argued, a number of alternate alterations would have advanced matters uniquely, too. So improvement upon change was virtually inevitable.

Thursday, February 07, 2008

sklansky's theorem: fundamentally flawed (iv)

Though it is entirely proper to test, and attempt to disprove, any theorem, perhaps the most pertinent inquiry to make of Sklansky’s is to ask: is it useful?

The general interpretation or use of the theorem appears in the validation of decisions through hand playbacks with opponents’ cards face up: if performed the same way, the boy done good.

I’ve personally found the process to placate rather than inform. After bad beats, I’ve consoled myself with: ‘well, I’d have played it the same had I known what he had’. Such thinking is quixotic, since, although the strategy, face-up, might be seen as perfect, face-down it could be dire. As poker players, we seldom deal with certainties.

On occasions where a decision is marked retrospectively correct in this contrived manner, but was deemed wrong on the (judgement-based) balance of probabilities at the time, applying hindsight analysis will likely mislead and so hinder learning.

A conflict, of sorts, arises when we witness a respected, better-informed player deviate from our selected path or option. We are instinctively compelled to re-evaluate, coaxed, perhaps, into presuming we were in error and tempted to mimic (or migrate towards) the play [1]. However, this can resemble the above trap – his balance of probabilities, his worldview (not to mention image), contrasts with our own; a superior player is, typically, better informed in the same sense, though to a lesser degree, as someone privy to hole cards: if our tools and skills are different, then so must be our models, and our answers [2].

Consider an amateur weatherman using an historical 3-day forecast algorithm [3]; light showers are predicted, at a low confidence level, for the following morning. The amateur weatherman baulks at the prediction and forecasts a dry day. Later that evening, after submitting his prediction, he listens out for the Markov Amateur-Weathermen Society’s forecast, which utilises a sophisticated 10-day algorithm: a clear day is predicted – with a high level of confidence. The day was indeed free of rain. So the amateur’s forecast was correct, but the decision? With better tools and more information, the weatherman will, perhaps, claim he would have reached such a forecast anyway; however, contradicting his basic algorithm on a whim is clearly gambling against the odds, which, ultimately, is a strategy destined to under-perform.

Of course, inevitably, as we develop, the cart will on occasion precede the horse – we are seduced into reproducing the effects (plays) of better models, so that our model resembles them, appears similar – to kid ourselves we’re improving. Nevertheless, in the poker world, one might argue that extrapolating this way, persisting in decisions that are (based on the current model) erroneous, may render a speedier, albeit costlier, progression – at least for some.

To conclude, it is surely advantageous to measure our decisions based on our interpretation of the information gathered, not on what wasn’t, or couldn’t be, gathered. The value in ascertaining the correctness of a decision based on a near-utopian level of awareness is far from apparent.

Next article

[1] Which of course can edify (& so develop the model), but also regress if applied without the necessary insight. E.g. value-betting weak hands.

[2] Obviously, not on every occasion.

[3] i.e. it uses the previous 3-days weather to predict the next.

Wednesday, April 25, 2007

sklansky's theorem: fundamentally flawed (iii)

‘The Fundamental Theorem applies universally when a hand has been reduced to a contest between you and a single opponent’ - Sklansky.

Though hardly churlish to contest on utility grounds, statements free from individual preference-states often illuminate and edify while those suffocated with caveats at times confuse.

There appears little value in generating chip-theories, so ‘gain’ is reasoned to mean the bottom line: cash-EV (EV). Sklansky’s statement, though, is naïve in its claim of universality, as it is predicated on EV flowing only between active players – which is patently untrue: chips do, but not always EV.

Players enduring deep into tournaments will invariably experience material emotions over the decisions, or outcomes, of hands in which they appear uninvolved. Unsurprising, since the odds of securing any given payout fluctuate, wildly at times, for all players with every decision or turn of a card (pot-active or not). Consequently, tournament hands should seldom, if ever, be considered zero-sum between only those active in the pot – not even, as Sklansky claims, when the pot is reduced to heads-up (HU). At each juncture, EV might flow in or out of the hand: it isn’t contained. Therefore, where tournament dynamics permit, the still pot-active players could each lose EV were a specific action engaged in (or passively gain when it is avoided).

Only two big and two small stacks survive in the tournament. The minnows give deference and fold; the chipped-up small blind elects to push against his rival’s big blind. Now, said rival has a read; he adjudges, with certainty, his adversary to hold either JJ or AK. Glancing down, he discovers QQ. Gulp. The short stacks, inevitably, are not passive onlookers: they long for a calling big blind. While the ongoing events will not materially impact either short stack’s chance of winning outright, the opportunity of securing a higher prize would present itself should the big stacks collide.

The big blind is aware that calling will see them both shed EV to the now salivating small stacks. However, the queens are a favourite over each of the small blind’s potential holdings – they hold an edge over the likelier AK, and dominate JJ. After due consideration, the (expected) pay-off from the SB is deemed sufficient compensation for his contribution to the EV drain: he’s calling.

Just as he commits, his opponent’s cards are accidentally disclosed: AK is shown. Despite still gaining EV from the SB, without the luxury of a chance domination of JJ the pay-off from the SB no longer adequately covers his loss to the small stacks. So he passes.

Now, since he rightly changes his mind, by Sklansky’s theorem he will gain. Also, by definition of the theorem, the small blind must lose out from this redress. Except he doesn’t: he gains, in fact more so than the QQ, since he was losing EV to everyone. It is the small stacks who lose out; Sklansky’s Fundamental Theorem does not hold.

The example is, admittedly, extreme; however, whenever it is possible for EV to flow out of a heads-up hand, Sklansky’s claim is under threat, since the hand isn’t a ‘zero-sum game’ w.r.t. the two combatants.
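
One conventional way (not the author’s, and not Sklansky’s) to put numbers on EV leaking out of a heads-up hand is the Malmuth–Harville independent chip model: tournament equity is a function of every stack, so a collision between the two big stacks lifts the short stacks’ equity even though their chips never move. The stacks and payout structure below are invented for illustration.

```python
from itertools import permutations

def icm_equity(stacks, payouts):
    """Malmuth-Harville ICM: the probability of a finishing order is the product
    of each player's chip share among those not yet placed; equity is the
    payout-weighted sum over all orders."""
    equity = [0.0] * len(stacks)
    for order in permutations(range(len(stacks))):
        prob, remaining = 1.0, float(sum(stacks))
        for idx in order:
            prob *= stacks[idx] / remaining
            remaining -= stacks[idx]
        for place, idx in enumerate(order[:len(payouts)]):
            equity[idx] += prob * payouts[place]
    return equity

payouts = [500, 300, 200]                                 # invented 50/30/20 structure
before = icm_equity([4000, 4000, 1000, 1000], payouts)    # two big, two short stacks
after  = icm_equity([8000, 1000, 1000], payouts)          # the big stacks collided; one busted
print([round(e) for e in before])   # ~[344, 344, 156, 156]
print([round(e) for e in after])    # ~[458, 271, 271]
```

Each short stack’s equity jumps from roughly 156 to roughly 271 without playing a hand; that gain is exactly what drains out of the colliding pair, which is the leak the QQ/AK example trades on.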

In tournament poker, passive gains and losses abound, even from other tables – a skilful player is put at risk, a short stack created. On the player’s own table, a collision resulting, say, in the emergence of a threatening stack to the right, at the expense of one to the left, is typically advantageous. Gains, passive or otherwise, must accrue from somewhere – vacate someone’s EV – whether from the active players or, indeed, from those passively, negatively affected (e.g. the player now at risk from the big stack on his left). Nevertheless, when one or more inactive players gain passively during a HU pot, both active players might, in theory, net-lose on the decision, to foot the passive-gain bill. Consequently, should either become cognisant of the other’s holding, both might benefit from a revised decision. So the theorem doesn’t apply universally ‘when a hand has been reduced to a contest between you and a single opponent’.

Passive gains exist in cash games too; furthermore, once gains w.r.t. utility are included, the failure-space increases. Sklansky’s claim holds only if gain is measured w.r.t. chips and the future impact of their redistribution is excluded – in so doing, supporting uninformed and potentially erroneous decisions.


Part (iv) to follow.

Sunday, February 11, 2007

sklansky's theorem: fundamentally flawed (ii)

The late Andy Morton cleverly disproved Sklansky’s theorem in the late 90s. Morton’s Theorem, as it became known, is markedly more analytic and conventional than the softer rebuttals expressed in part (i). In multi-way pots, he showed, situations often occur where a player undertakes a decision benefiting both himself and an opponent, contradicting Sklansky’s theorem. Were a player, say, to gain, in the EV sense, from you but leak a greater sum to another, he’d be right to fold: and you’d want him to.

Example:

Limit Hold’em. The board on the turn reads: 8h, 10h, 4c, 6s.

Player 1 holds: Ad-Ah
Player 2 holds: Jh-Qh
Player 3 holds: Kc-10c

Transparency necessitates an assumption of zero implied odds, with the best hand at the river winning the pot. Player 2’s (P2) chances of winning can’t be diminished by a call from Player 3 (P3); as such, he can only benefit, through winning a potentially greater pot. Player 1 (P1), though, is ambivalent about this decision, since, like P2, when he wins he gains an extra bet; however, unlike P2, his chances of securing victory are reduced by such a call: there is a trade-off.

Suppose P3 receives precisely the odds to call. Our model Player 3 should thus be indifferent between folding and calling. However, P2 gains from, and thus hopes for, his call; so, since P3 is indifferent to his own call, it is, by deduction, the aces accounting for, and so bearing the cost of, P2’s potentially improved position. Although this doesn’t strictly illustrate Morton’s case, since P3 didn’t gain by folding, it does to all intents and purposes – Sklansky’s theorem suggests that if your opponent is indifferent to calling, then so are you. Clearly, that’s not always the case.

Now, for a slightly more numerical approach: as P3 deliberates, one might view the pot as jointly owned by his adversaries, already committed to the river. The dilemma for Player 3 is whether or not to join the party. Assume P1 owns 70% and P2 30% of a pot currently standing at $800. For simplicity, assume forty cards remain, of which just four land Player 3 the spoils. So, evidently, he is a 9-1 shot receiving odds of only 8-1. In this case, as above, P2’s prospects of winning are unaffected by his successor’s decision: still 30%. Should P3 elect to call, he and the front-runner, P1, could, through an award of 30% of the pot, legitimately settle up with Player 2 and proceed to strip out, or void, P2’s 12 outs, with the river allocating the remainder.

With P3 folding, P1 can expect a return of 70% of $800: $560. But with a call, and P2 settled out before the river, what now is P1’s reward? The depleted deck holds just 28 cards, of which P1 draws to all but 4 for the reduced pot of $630 ($900 − $270). Player 3’s expected return from the $100 turn investment is 4/28 × $630 = $90. As expected, a losing investment: he’d have been better off folding. Player 1’s equity is also reduced, by $20, to $540 ($630 − $90); thus both P1 and P3 are better off if P3 elects to fold.
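
A quick sketch reproducing the arithmetic above, using the figures exactly as given in the post.

```python
pot = 800                      # pot facing P3 on the turn
call = 100                     # 8-to-1 pot odds against a 4-in-40 (9-to-1) draw
p1_share, p2_share = 0.70, 0.30
cards_left, p3_outs, p2_outs = 40, 4, 12

# P3 folds: P1 simply keeps his 70% of the $800 pot.
p1_if_fold = p1_share * pot                                     # $560

# P3 calls: settle P2 for 30% of the new $900 pot, void his 12 outs,
# and let the river share out what is left between P1 and P3.
new_pot = pot + call                                            # $900
reduced_pot = new_pot * (1 - p2_share)                          # $630
live_cards = cards_left - p2_outs                               # 28
p3_return = p3_outs / live_cards * reduced_pot                  # $90 back on a $100 call
p1_if_call = (live_cards - p3_outs) / live_cards * reduced_pot  # $540

print(p1_if_fold, p1_if_call, p3_return - call)                 # 560.0 540.0 -10.0
```

Both P1 and P3 do better when P3 folds; the $30 the pair of them shed on the call is precisely P2’s gain from the fatter pot, which is the decomposition the next paragraph leans on.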


Unfortunately, the final example testifies, somewhat inevitably, to the existence of subtle and almost unprovable degrees of collusion in our game. Here, P3’s $10 loss is the result of a $20 credit from P1 and a $30 debit to P2; of course, between colluders, the drain of EV to P2 is an illusion. Naturally, no such explicit case will occur – seldom will the two cohorts be certain of their lone foe’s holding; however, clear folds become marginal ones, marginal folds become clear calls, and so on. All the while there is little hint of cheating: simply a localised increase in the frequency of bad beats via weak(ish) calls.

For a fuller and more mathematical treatment it is fitting to visit Morton’s original post.

Morton contends such occurrences are more frequent than Sklansky’s ‘rare exceptions’; which appears rational, given the unexceptional situations described.

Still Sklansky affords no concession for Heads-Up pots (one-on-one), stating: ‘The Fundamental Theorem applies universally when a hand has been reduced to a contest between you and a single opponent’.

Next article: part (iii)