Summary: This paper presents a model-based approximate λ-policy iteration approach using temporal differences for optimizing paths online for a pursuit-evasion problem, where an agent must visit several target positions within a region of interest while simultaneously avoiding one or more actively pursuing adversaries. This method is relevant to applications such as robotic path planning, mobile-sensor applications, and path exposure. The methodology described utilizes cell decomposition to construct a decision tree and...