Game Theory Cheat Sheet
Game theory is the study of mathematical models of conflict and cooperation between intelligent rational
decision-makers.
Game theory is mainly used in economics, political science, and psychology, as well as logic, computer science and
biology.
Players: who are the decision makers?
Actions: what can the players do?
Payoffs: what motivates players?
Stay silent | Betray | |
---|---|---|
Stay silent | -1,-1 | -3,0 |
Betray | 0,-3 | -2,-2 |
Positive affine transformation: au + b, where a > 0
and b is any real number. Expected utilities are identical to positive affine transformations.
Simultaneous games (a.k.a. Strategy games) - games where both players move simultaneously, or if they do Usually normal form is used to represent simultaneous games. | Sequential games (a.k.a. Extensive game) - games where later players have some knowledge about earlier Usually extensive form is used to represent sequential games. |
Cooperative games - games where the players are able to form binding commitments externally enforced (e.g. through contract law). | Non-cooperative games - games where players cannot form alliances or if all agreements need to be self-enforcing (e.g. through credible threats). |
Zero-sum games - games in which each participant’s gain or loss of utility is exactly balanced by the losses or gains of the utility of the other participants. | Non-zero-sum games - games in which the interacting parties’ aggregate gains and losses can be less than or more than zero. |
Perfect information games - games in which all players know the moves previously made by all other players. | Imperfect information games - games in which some players don’t know the moves previously made by other players. |
Complete information games - games in which all players know the strategies and payoffs available to the other players. | Incomplete information games - games in which some players don’t know the strategies or payoffs available to the other players. |
Finite games - games that last for finite number of moves. | Infinite games - games that last for infinite number of moves. |
Stay silent | Betray | |
---|---|---|
Stay silent | -1,-1 | -3,0 |
Betray | 0,-3 | -2,-2 |
Play heads | Play tails | |
---|---|---|
Play heads | 1,-1 | -1,1 |
Play tails | -1,1 | 1,-1 |
Go to opera | Go to football | |
---|---|---|
Go to opera | 3,2 | 0,0 |
Go to football | 0,0 | 2,3 |
Hunt stag | Hunt hare | |
---|---|---|
Hunt stag | 2,2 | 0,1 |
Hunt hare | 1,0 | 1,1 |
Go to Party | Stay Home | |
---|---|---|
Go to Party | 10,10 | 0,0 |
Stay Home | 0,0 | 5,5 |
Play heads | Play tails | |
---|---|---|
Play heads | 1,1 | 0,0 |
Play tails | 0,0 | 0,0 |
Swerve | Straight | |
---|---|---|
Swerve | 0,0 | -1,1 |
Straight | 1,-1 | -10,-10 |
Nash Equilibrium - a set of strategies, one for each player, such that no player has incentive to change
his strategy given what the other players are doing.
Nash Equilibrium (alternative definition) - a set of strategies, one for each player, such that every player’s
strategy is the best response to what the other players are doing.
Best Response - a strategy such that a player cannot gain more utility from switching to a different strategy,
given what all other players are doing.
Mixed Strategy - a probability distribution over two or more pure strategies, that is, the players choose
randomly among their options in equilibrium.
Mixed Strategy Nash Equilibrium - a set of mixed strategies, one for each player, such that no player has
incentive to change his strategy given what the other players are doing.
Dominant Strategy - a strategy that is always better than any other strategy, for any profile of other players’
actions.
Dominant Strategy Nash Equilibrium (equilibrium in dominant strategies) - a Nash equilibrium in which all strategies
are strictly dominant. If it exists can be found by elimination of strictly dominated strategies.
Dominated Strategy - a strategy, such that, regardless of what any other players do, the strategy earns a player
a smaller payoff than some other strategy.
The Core (analogous to Nash equilibrium for coalitional games)
Nash equilibrium for different types of games:
Simultaneous | Sequential | |
---|---|---|
Complete | Nash | Subgame Perfect Nash |
Incomplete | Bayesian Nash | Perfect Bayesian Nash |
Every finite, non-cooperative game of two or more players has a mixed strategy Nash equilibrium. (John Nash, 1950)
Pareto-optimal Outcome - an outcome, such that there is no other outcome that Pareto-dominates it. An outcome o
Pareto-dominates outcome o'
if it’s at least as good for every player as outcome o'
, and there is some player
who strictly prefers o
to o'
.
Almost all finite games have a finite number of solutions, and that number is also odd. (Robert Wilson, 1971)
All players know the moves previously made by all other players.
Every game in extensive form can be converted into normal form. The reverse transformation is not always possible,
e.g. matching pennies cannot be written as a perfect information extensive form game.
Every perfect information game in extensive form has a pure strategy Nash equilibrium.
Player 2 doesn’t know the move made by Player 1.
Backward Induction - identify the equilibria in the bottom-most trees, and adopt these as one moves up the tree.
Subgame Perfect Equilibrium - Nash equilibrium that represents a Nash equilibrium of every subgame in the original
game. It’s a refinement of the Nash equilibrium that eliminates non-credible threats.
Non-credible Threat - a threat made by a player in an extensive form game which would not be
in the best interest for the player to carry out. The hope is that the threat is believed in which case there is
no need to carry it out. While Nash equilibria may depend on non-credible threats, Backward Induction eliminates them.
Repeated Game - an extensive form game that consists of a number of repetitions of some base game, called
a stage game. The stage game is usually one of the well-studied 2-person games.
Discount Factor - a number between 0 and 1 that represents the time value of consumption and probability of
continuation. A higher discount factor means more patience and higher chance of surviving into the next period.
The One-Shot Deviation Principle - in finite or infinitely repeated games with discounting, a set of strategies
is a subgame perfect equilibrium iff no player can profitably deviate from his strategy at a single stage and maintain
his strategy everywhere else.
Grim Trigger
The strategy profile where everyone plays grim trigger is a subgame perfect equilibrium.
Tit-for-Tat - a strategy in the infinitely repeated prisoner’s dilemma:
The strategy profile where everyone plays tit-for-tat is not a subgame perfect equilibrium.
Implications of the folk theorem:
Stochastic Game - a generalization of repeated games.
Fictitious Play: each player maintains explicit belief about the other players.
No-regret Learning:
Bayesian Game (Incomplete Information Game) - a game in which players have incomplete information on the other
players’ strategies and payoffs, but, they have beliefs with known probabilities. It can be modeled as a normal form
game with the difference that each player has multiple types with known probabilities (called a common prior beliefs).
Bayes-Nash Equilibrium (Bayesian Nash Equilibrium) - a set of strategies, one for each type of player, such that no
type has incentive to change his or her strategy given the beliefs about the types and what the other types are doing.
3 stages of a Bayesian game:
Ex-ante Dominated Strategy - a strategy for a player such that an alternative strategy for that player provides a
greater payoff for that player regardless of all other players’ strategies
Interim Dominated Strategy - a strategy for a type such that an alternative strategy for that type provides a
greater payoff for that type regardless of all other players’ strategies
Interim dominated strategy implies ex-ante dominated strategies. The reverse is not always true.
Almost all mixed strategy Nash equilibria in a complete information game are the limit of pure strategy Bayesian Nash
equilibria in an incomplete information game that converges to the complete information game.
Such games are called Bayesian because of the probabilistic analysis inherent in the game. Players have initial beliefs
about the type of each player (where a belief is a probability distribution over the possible types for a player) and
can update their beliefs according to Bayes’ rule as play takes place in the game, i.e. the belief a player holds about
another player’s type might change on the basis of the actions they have played.
Bayes’ Theorem
P(A|B) = P(B|A) * P (A) / P(B), where
Bayes’ Rule
Bayes’ theorem in odds form is:
O(A1:A2|B) = O(A1:A2) * Λ(A1:A2|B), where
Λ(A1:A2|B) = P(B|A1) / P(B|A2) - Bayes factor or likelihood ratio, and
O(A1:A2) = P(A1) / P(A2) - odds between to events.
So the rule says that the posterior odds are the prior odds times the Bayes factor,
or in other words, posterior is proportional to prior times likelihood.
Coalitional Game is given by specifying a value for every coalition.
Formally, the coalitional game consists of a finite set of players N, called the grand coalition,
and a characteristic function v: 2 ^ N -> ℝ from the set of all possible coalitions of players to a set of payments
that satisfies v(∅)=0.
Two ways for allocating payoffs:
The Shapley Value allocates the value of a group according to marginal contribution calculations.
where the sum ranges over all |N|! orders R of the players and is the set of
players in N which precede i in the order R.
For any coalitional game, there is a unique payoff division (the Shapley Value) that divides the full payoff of the
grand coalition and
that satisfies the 3 axioms
The Core - the set of payoff vectors under which no coalition has a value greater than the sum of its members’
payoffs. Therefore, no coalition has incentive to leave the grand coalition and receive a larger payoff.
A game is simple if for all coalitions the value of the coalition is either 0 or 1.
A player is a veto player if the value of all coalitions that don’t involve the player is 0.
In a simple game the core is empty iff there is no veto player. If there are veto players, the core consists of all
payoff vectors in which the nonveto players get 0.
A game is convex if its characteristic function v is supermodular:
that is, “the incentives for joining a coalition increase as the coalition grows”.
Every convex game has a nonempty core.
In every convex game, the Shapley value is in the core.
Social Choice Function - a function that, given a set of linear orderings on the outcomes, tells which
outcome should be chosen.
Social Welfare Function - a function that, given a set of linear orderings on the outcomes, tells which
ordering should be chosen.
Voting Schemes:
n - 1
, the next most preferred gets n - 2
, down to the n
thCondorcet winner - an outcome that is preferred to every other outcome in pairwise majority-rule comparison.
It doesn’t exist when there is a Condorcet cycle e.g. a situation when A defeats B, B defeats C and C defeats A.
Condorcet consistency - if there is a Condorcet winner it must be selected by the social choice function.
Any social welfare function over 3 or more outcomes that is Pareto efficient and independent of irrelevant alternatives
is dictatorial. (Kenneth Arrow, 1951)
Social welfare function is Pareto efficient if whenever all agents agree on the ordering of two outcomes, the social
welfare function selects that ordering.
Social welfare function is independent of irrelevant alternatives if the selected ordering between two outcomes
depends only on the relative orderings they are given by the agents.
Social welfare function is dictatorial if there exists a single agent whose preferences always determine the social
ordering.
Any social choice function that is weakly Pareto efficient and monotonic is dictatorial. (definitions are similar
to the corresponding terms for social welfare function)
In advance, decide an ordering of outcomes (e.g. according to left-right political spectrum for political parties)
A group of agents is said to have single-peaked preferences if:
Median voting - the median of the most preferred outcomes is selected.
With median voting a condorcet winner always exists if there is an odd number of voters.
Mechanism Design (also called Inverse Game Theory) - a field in game theory that focuses on designing the game
structure e.g. choosing the actions available to players and mappings of action profiles to outcomes, so as to optimise
for certain qualities e.g. incentive compatibility, Pareto efficiency, individual rationality etc.
Game Setting - components of the game that we, as game designers don’t get to control,
e.g. a set of agents, a set of outcomes, common priors etc.
Mechanism - components of the game, which, when added to a corresponding game setting, turn it into a game.
An example is a set of available actions for agents and mapping of action profiles to outcomes.
Bayesian Game Setting - a tuple (N, O, Θ, p, u):
Mechanism for a Bayesian Game Setting - is a mechanism where the designer gets to specify the action sets for the
agents and the mapping to outcomes, over which agents have utility. Thus it is a pair (A, M), where
Given a Bayesian game setting (N, O, Θ, p, u), a mechanism (A, M) is an implementation in dominant strategies of
a social choice function C (over N and O) if for any vector of utility functions u, the game has an equilibrium
in weakly-dominant strategies, and in any such equilibrium a we have M(a) = C(u)
Given a Bayesian game setting (N, O, Θ, p, u), a mechanism (A, M) is an implementation in Bayes-Nash equilibrium of
a social choice function C (over N and O) if there exists a Bayes-Nash equilibrium of the
Bayesian game (N, A, Θ, p, u), such that for every type profile θ ∊ Θ and every action profile a ∊ A that can arise
given type profile θ in this equilibrium, we have M(a) = C(u(.,θ))
Agents have quasilinear preferences with transferable utility in an n-player Bayesian game when the set of outcomes
is:
O = X × ℝn
for a set X, if the utility of an agent i given joint type θ can be written:
ui(o,θ) = ui(x,θ) - pi,
where o = (x,p) is an element of O, and ui: X × Θ ⇒ ℝ.
The corresponding game setting is called Quasilinear Setting.
A direct mechanism in a quasilinear setting (N, O = X × ℝn, Θ, p, u) is a pair (χ, ρ) specifying a
basic outcome χ(θ) and a profile of payments ρ(θ) = (p1(θ),…,pn(θ)).
Preferences have private values, or satisfy conditional utility independence, if each agent i’s utility function
does not depend on the other agents’ types, i.e it can be written as ui(o,θi).
An agent’s type becomes their valuation function: i’s value for choice x ∊ X is vi(x) = ui(x, θi).
Alternative definition of a direct mechanism with private values:
Direct Mechanism - a mechanism where the set of joint actions is equal to the set of joint types, i.e.
the agents have to declare their types to the mechanism.
Incentive Compatible (aka Truthful or Strategy-proof) Mechanism - a direct
mechanism where declaring true type for every agent is a weakly-dominant strategy Nash equilibrium. In other words, every
agent fare best or at least not worse by being truthful, regardless of what the others do.
A transferable utility mechanism is strictly Pareto efficient, or just efficient, if in equilibrium it selects
the choice that maximizes the sum of agents’ utilities, disregarding monetary payments.
A transferable utility mechanism is budget balanced when, regardless of the agents’ types, in equilibrium the
mechanism collects and disburses the same amount of money from and to the agents.
A transferable utility mechanism is individual rational when, in equilibrium no agent loses by
participating in the mechanism, i.e. the valuation subtracting the payment for every agent is not negative.
A mechanism is tractable when for every possible valuation profile the mechanism mapping function can be
computed in polynomial time.
A mechanism X is revenue maximizing when among the set of other mechanisms that satisfy the other constraints,
the mechanism X maximizes the total payments made by agents in equilibrium.
A mechanism is maxmin fair if it “makes the least-happy agent the happiest”.
Any social choice function that can be implemented by any mechanism can be implemented by a truthful, direct mechanism.
In mechanism design, the revelation principle is of utmost importance in finding solutions. The researcher need only
look at the set of equilibrium characterized by truthfulness. That is, if the mechanism designer wants to
implement some outcome or property, he can restrict his search to mechanisms in which agents are willing to reveal
their private information to the mechanism designer that has that outcome or property. If no such direct and truthful
mechanism exists, no mechanism can implement this outcome/property. By narrowing the area needed to be searched, the
problem of finding a mechanism becomes much easier.
For every social choice function, one of the following three things must hold:
Median voting in single-peaked domains is strategy-proof (any other statistics can be used instead of median
e.g. max or min)
Trade is strategy-proof:
The Vickrey-Clarke-Groves (VCG) mechanism is a general way for self-interested agents to choose a social-welfare maximizing outcome. It works
in quasilinear utility settings, i.e. where monetary payments are applied, although has some limitations (listed below).
Examples of where it can be used
Qualities
Under additional assumptions about the setting, can satisfy:
History
Groves mechanism - a direct transferable utility mechanism such that:
Vickrey-Clarke-Groves (VCG) mechanism (aka Pivotal mechanism) - a Groves mechanism such
that:
In the VCG mechanism:
Truth telling is a dominant strategy under any Groves mechanism including the pivotal mechanism (a VCG mechanism)
Suppose that for all agents any utility function is possible. Then a Pareto efficient mechanism has truthful
reporting as a dominant strategy for all agents and preferences only if it is Groves mechanism.
Privacy
VCG requires agents to fully reveal their private information. This private information may have value to agents
that extends beyond the current interaction. For example, the agents may know that they will compete with each
other again in the future.
Susceptibility to collusion
Agents can benefit by colluding. For example 2 agents can increase their valuation for an outcome,
which will decrease their payment.
VCG is not frugal
The gap between agent’s true cost and the payment they could receive under VCG is unbounded.
VCG can end up paying arbitrarily more than an agent is willing to accept (or equivalently charging
arbitrarily less than an agent is willing to pay).
Revenue Monotonicity Violated
Revenue always weakly increases as agents are added. An agent could pretend to be 2 agents and eliminate
his payment. (Sybil attack)
Cannot Return All Revenue to Agents
We might want to find some way of returning the mechanism’s profits back the agents.
However, the possibility of receiving a rebate after the mechanism has been run changes the agents’ incentives.
Theorem
The VCG mechanism is ex-post individual rational when the choice set monotonicity and no negative externalities
properties hold.
An environment exhibits choice-set monotonicity if for all agents the set of outcomes that are achievable
without that agent present is a weak subset of the set of outcomes that are possible when that agent is present.
An environment exhibits no negative externalitites if for all agents and all choices that can be made
without that agent, the agent’s own valuation for each of these choices is non-negative.
Theorem
The VCG mechanism is weakly budget-balanced when the no single-agent effect property holds.
No single-agent effect - if I drop an agent i and then I pick some other choice instead without i,
everybody other than i is at least as happy with the new choice as with the old choice.
Theorem (Krishna & Perry, 1998)
In any Bayesian game setting in which VCG is ex post individually rational, VCG collects at least as much revenue
as any other efficient and ex interim individually-rational mechanism.
A useful corollary: VCG is as budget balanced as any efficient mechanism can be: it satisfies weak budget
balance in every case where any dominant strategy, efficient and ex interim individually-rational
mechanism is able to.
https://github.com/medvedev1088/cryptoeconomics-cheat-sheet