Abstract
In this paper we propose a new acceleration method for solving Markov decision processes. Value iteration is a classical algorithm for solving Markov decision processes, but this algorithm and its variants are quite slow for discount factors close to one and their convergence properties depend to a great extent on a good state update order. Recently, it has been shown that improved topological value iteration presents a good convergence speed thanks to the use of an improved topological ordering algorithm. Nevertheless, the drawback of this algorithm is due to its memory requirements. So, we present a different method to obtain a good state backup order with less memory requirements. Experimental results obtained on a stochastic shortest path problem are presented.