Back in college, I learned about a tool called the “Bellman Equation”. It’s very nice because it reduces to a local calculation for each node: you only need to know your neighbors’ previous values. That makes it parallelizable (update every node in parallel, sync, and repeat until convergence). The only gotcha for using a Bellman equation […]
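The "update every node, sync, repeat" loop described in this excerpt can be sketched roughly as follows. This is a minimal illustration rather than the post's own code: the graph layout, the `reward` table, the discount `gamma`, and the function names `bellman_sweep` and `solve` are all assumptions made for the example.

```python
def bellman_sweep(values, neighbors, reward, gamma=0.9):
    """One synchronous sweep over all nodes.

    Each node reads only its neighbors' values from the previous sweep,
    so every entry could be computed in parallel before syncing.
    """
    return {
        node: reward[node]
        + gamma * max((values[n] for n in neighbors[node]), default=0.0)
        for node in values
    }


def solve(neighbors, reward, gamma=0.9, tol=1e-9, max_sweeps=10_000):
    """Repeat sweeps until no value changes by more than tol."""
    values = {node: 0.0 for node in neighbors}
    for _ in range(max_sweeps):
        new_values = bellman_sweep(values, neighbors, reward, gamma)
        if max(abs(new_values[n] - values[n]) for n in values) < tol:
            return new_values
        values = new_values
    return values


# Hypothetical three-node chain: a -> b -> goal, reward only at the goal.
chain = {"a": ["b"], "b": ["goal"], "goal": []}
chain_reward = {"a": 0.0, "b": 0.0, "goal": 1.0}
chain_values = solve(chain, chain_reward)
```

Because each sweep builds a fresh dictionary from the previous one, no node ever sees a half-updated neighbor, which is what makes the per-node work safe to run in parallel.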

# Year: 2014

The final stop on this heuristics tour, and the last stop in our overview of Cooperative Decision Making, is Joint Equilibrium Search. This technique starts with some preset horizon-T policies for each agent, and then cycles through the agents so that each may tweak its behavior to maximize the response with all other policies […]

Memory-bounded dynamic programming is another technique offered in Cooperative Decision Making. It is the first sub-optimal heuristic brought up. It uses the same exhaustive backup seen before, but at each stage only a fixed number of trees are kept once these operations finish. Due to this, the […]

## Tour de Heuristics: Policy Iteration

Policy Iteration is the most available option for dealing with infinite-horizon Dec-POMDPs. In this space, it is sub-optimal. It can, however, be epsilon-optimal. Epsilon optimality means that, based on the starting point and a decay factor, we can plan a controller out for enough steps that the expected discounted reward for any more steps […]
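One way to make "enough steps" concrete is sketched below, under two assumptions that are not taken from the post: the decay factor is a discount `gamma` strictly between 0 and 1, and every per-step reward is bounded in magnitude by `r_max`. Under those assumptions, everything earned from step `t` onward is worth at most `gamma**t * r_max / (1 - gamma)`, so we can solve for the smallest `t` that pushes this tail below `epsilon`. The function name `steps_for_epsilon` is hypothetical.

```python
import math


def steps_for_epsilon(gamma, r_max, epsilon):
    """Smallest t with gamma**t * r_max / (1 - gamma) <= epsilon.

    Assumes 0 < gamma < 1 and per-step rewards bounded by r_max in magnitude.
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must be strictly between 0 and 1")
    if r_max <= 0 or epsilon <= 0:
        raise ValueError("r_max and epsilon must be positive")
    bound = epsilon * (1.0 - gamma) / r_max
    if bound >= 1.0:
        return 0  # the tail is already within epsilon before any step is planned
    return math.ceil(math.log(bound) / math.log(gamma))


horizon = steps_for_epsilon(gamma=0.9, r_max=1.0, epsilon=0.01)
```

A discount closer to 1 makes the required horizon grow quickly, which is why the choice of decay factor matters so much for how large the controller has to be.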

## Tour de Heuristics: MAA*

Multiagent A* is a heuristic that takes the commonly used A* algorithm and applies it to Dec-POMDPs. Let’s investigate how it works. The Algorithm: def estimated_state_value(belief, action): """ The cornerstone of the A* algorithm is to have an optimistic estimator. Because an MDP assumes more information, it will always have at least as much value […]

## Multi-Agent Systems: Finite Horizons

In our previous post, we covered the basics of what a Dec-POMDP is. Let’s actually look at what a policy is and how we can generate one. The Example In this example, there are two robots that are trying to meet anywhere on the map. They want to do this optimally. Unfortunately, they don’t have […]

This is the first post in a study on all things using Decentralized Partially Observable Markov Decision Processes (Dec-POMDP) with my professor, Prithviraj Dasgupta, who runs the CMantic Lab at the University of Nebraska, Omaha. I intend to write the summaries of what I find as blog posts, so be prepared to go on a […]

## Nightmare before Christmas Code

A common sight in the enterprise world is the “implementation” of new technology using the same methods as old technology: doing things “the old way” with the new tech. One example of “santa tech” that has recently been abused is the term “REST”. Many people go in and rewrite SOAP […]