Meta-learning within Projective Simulation

Adi Makmal, Alexey A. Melnikov, Vedran Dunjko, and Hans J. Briegel
(Dated: February 26, 2016)

Learning models of artificial intelligence can nowadays perform very well on a large variety of tasks.
However, in practice different task environments are best handled by different learning models, rather
than a single, universal, approach. Most non-trivial models thus require the adjustment of several
to many learning parameters, which is often done on a case-by-case basis by an external party.
Meta-learning refers to the ability of an agent to autonomously and dynamically adjust its own
learning parameters, or meta-parameters. In this work we show how projective simulation, a recently
developed model of artificial intelligence, can naturally be extended to account for meta-learning
in reinforcement learning settings. The projective simulation approach is based on a random walk
process over a network of clips. The suggested meta-learning scheme builds upon the same design
and employs clip networks to monitor the agent's performance and to adjust its meta-parameters
"on the fly". We distinguish between "reflexive adaptation" and "adaptation through learning", and
show the utility of both approaches. In addition, a trade-off between flexibility and learning-time is
addressed. The extended model is examined on three different kinds of reinforcement learning tasks,
in which the agent has different optimal values of the meta-parameters, and is shown to perform
well, reaching near-optimal to optimal success rates in all of them, without ever needing to manually
adjust any meta-parameter.
