Approximate policy iteration: a survey and some new methods

Abstract: We consider the classical policy iteration method of dynamic programming (DP), where approximations and simulation are used to deal with the curse of dimensionality. We survey a number of issues: convergence and rate of convergence of approximate policy evaluation methods, singularity and susceptibility to simulation noise of policy evaluation, exploration issues, constrained and enhanced policy iteration, policy oscillation and chattering, and optimistic and distributed policy iteration. Our discussion of policy eva...
Institution: not specified
Publication date: March 13, 2011 (date first posted online on the 中国期刊网 platform, not the paper's publication date)