Tour de Heuristics: Joint Equilibrium Search (JESP)

May-2014

The final stop on this heuristics tour, and the last stop in our overview of Cooperative Decision Making, is Joint Equilibrium Search (JESP). The technique starts with a pre-set horizon-T policy tree for each agent, then cycles through the agents one at a time. On its turn, each agent adjusts its own policy to be the best response to the other agents' policies, which are held fixed. The cycle repeats until no agent's tree changes.

The Algorithm

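import copy

def joint_equilibrium_search(policy):
    # get_all_agents, calculate_expected_values and get_best_response are
    # problem-specific helpers that are not defined in this post.
    curr_policy = policy
    while True:
        # Snapshot by value: a plain assignment would alias curr_policy and
        # the comparison below would always report "no change".
        prev_policy = copy.deepcopy(curr_policy)
        for agent in get_all_agents():
            # Re-evaluate, then let this agent best-respond to the others.
            calculate_expected_values(curr_policy)
            curr_policy[agent] = get_best_response(curr_policy, agent)
        # Stop once a full pass leaves every agent's policy unchanged.
        if prev_policy == curr_policy:
            return curr_policy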
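
To watch the loop run end to end, here is a minimal sketch on a toy problem. Everything in it is an assumption made for illustration: each agent's "policy" is shrunk to a single action index instead of a horizon-T tree, both agents share one made-up payoff table (PAYOFF), and the helpers joint_value and get_best_response are hypothetical stand-ins for the problem-specific functions the pseudocode calls.

```python
# Hypothetical shared payoff table: PAYOFF[a0][a1] is the team reward when
# agent 0 plays action a0 and agent 1 plays action a1.
PAYOFF = [
    [5, 0, 0],
    [0, 8, 0],
    [0, 0, 10],
]


def joint_value(policy):
    """Team reward of a joint policy [action of agent 0, action of agent 1]."""
    return PAYOFF[policy[0]][policy[1]]


def get_best_response(policy, agent):
    """Best action for `agent` with the other agent's action held fixed."""
    def value_if(action):
        trial = list(policy)
        trial[agent] = action
        return joint_value(trial)

    # Keep the current action unless another one is strictly better, so a
    # tie can never make the outer loop flip back and forth.
    best = policy[agent]
    for action in range(len(PAYOFF)):
        if value_if(action) > value_if(best):
            best = action
    return best


def joint_equilibrium_search(policy):
    curr_policy = list(policy)
    while True:
        prev_policy = list(curr_policy)  # snapshot by value, not by alias
        for agent in range(len(curr_policy)):
            curr_policy[agent] = get_best_response(curr_policy, agent)
        if curr_policy == prev_policy:
            return curr_policy


for start in ([0, 0], [0, 2]):
    result = joint_equilibrium_search(start)
    print(start, "->", result, "value", joint_value(result))
# [0, 0] -> [0, 0] value 5
# [0, 2] -> [2, 2] value 10
```

The two starting points settle on different joint policies with different values, even though the same loop ran on the same payoff table.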

Well, That’s Nifty

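It seems like a slam dunk, but there are a few gotchas. From an optimality standpoint, the result is only locally optimal: the loop stops as soon as no single agent can improve by changing its own tree, even if a better joint policy exists that would require several agents to change at once. Which stopping point it reaches depends on the starting policies. To mitigate this, some earlier step needs to pick reasonably good starting trees before JESP runs.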
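
Stated a little more formally, in notation assumed here (none of these symbols appear in the algorithm as written): let V be the expected team reward of a joint policy, let \pi_i be agent i's policy, and let \pi_{-i} be the policies of everyone else. The loop stops at a joint policy \pi^* where no agent gains by deviating alone:

```latex
V(\pi_i^*, \pi_{-i}^*) \;\ge\; V(\pi_i, \pi_{-i}^*)
\qquad \text{for every agent } i \text{ and every alternative policy } \pi_i .
```

This is a Nash equilibrium condition. It says nothing about joint policies in which two or more agents change together, which is why it does not imply that V(\pi^*) is the maximum over all joint policies.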
