Access Restriction

Author Ribeiro, Richardson ♦ Borges, André P. ♦ Koerich, Ro L. ♦ Scalabrin, Edson E. ♦ Enembreck, Fabrício
Source CiteSeerX
Content type Text
File Format PDF
Subject Domain (in DDC) Computer science, information & general works ♦ Data processing & computer science
Abstract In this paper we propose a novel strategy for converging dynamic policies generated by adaptive agents, which receive and accumulate rewards for their actions. The goal of the proposed strategy is to speed up the convergence of such agents to a good policy in dynamic environments. Because continuous changes in the environment make it difficult to obtain a good value for a state, previous policies are kept in memory for reuse in future policies, avoiding delays or unexpected speedups in the agent’s learning. Experimental results on dynamic environments with different policies show that the proposed strategy speeds up the convergence of the agent while achieving good action policies.
Educational Role Student ♦ Teacher
Age Range above 22 years
Educational Use Research
Education Level UG and PG ♦ Career/Technical Study
Learning Resource Type Article