Hits |
Authors |
Title |
Venue |
Year |
Link |
Author keywords |
109 | Luca de Alfaro, Krishnendu Chatterjee, Marco Faella, Axel Legay |
Qualitative Logics and Equivalences for Probabilistic Systems. |
QEST |
2007 |
DBLP DOI BibTeX RDF |
|
83 | Sooraj Bhat, David L. Roberts 0001, Mark J. Nelson, Charles L. Isbell Jr., Michael Mateas |
A globally optimal algorithm for TTD-MDPs. |
AAMAS |
2007 |
DBLP DOI BibTeX RDF |
Markov decision processes, convex optimization, interactive entertainment |
83 | Kee-Eung Kim, Thomas L. Dean |
Solving Factored MDPs with Large Action Space Using Algebraic Decision Diagrams. |
PRICAI |
2002 |
DBLP DOI BibTeX RDF |
|
82 | Krishnendu Chatterjee |
Markov Decision Processes with Multiple Long-Run Average Objectives. |
FSTTCS |
2007 |
DBLP DOI BibTeX RDF |
|
69 | Dmitri A. Dolgov, Edmund H. Durfee |
Symmetric approximate linear programming for factored MDPs with application to constrained problems. |
Ann. Math. Artif. Intell. |
2006 |
DBLP DOI BibTeX RDF |
Mathematics Subject Classifications (2000) 60J22, 62C99, 90C90 |
68 | Hugo Gimbert, Wieslaw Zielonka |
Limits of Multi-Discounted Markov Decision Processes. |
LICS |
2007 |
DBLP DOI BibTeX RDF |
|
56 | Eric V. Denardo, Eugene A. Feinberg, Uriel G. Rothblum |
On occupation measures for total-reward MDPs. |
CDC |
2008 |
DBLP DOI BibTeX RDF |
|
56 | Dmitri A. Dolgov, Michael R. James 0001, Michael E. Samples |
Combinatorial resource scheduling for multiagent MDPs. |
AAMAS |
2007 |
DBLP DOI BibTeX RDF |
task and resource allocation in agent systems, multiagent planning |
52 | Jianhui Wu 0006, Edmund H. Durfee |
Automated resource-driven mission phasing techniques for constrained agents. |
AAMAS |
2005 |
DBLP DOI BibTeX RDF |
abstract MDPs, constrained MDPs, mission phasing, mixed integer programming |
42 | Thomas Gabel, Martin A. Riedmiller |
Evaluation of Batch-Mode Reinforcement Learning Methods for Solving DEC-MDPs with Changing Action Sets. |
EWRL |
2008 |
DBLP DOI BibTeX RDF |
|
42 | Dmitri A. Dolgov, Edmund H. Durfee |
Resource allocation among agents with preferences induced by factored MDPs. |
AAMAS |
2006 |
DBLP DOI BibTeX RDF |
(multi-)agent planning, task and resource allocation in agent systems |
42 | Alberto Reyes, Pablo H. Ibargüengoytia, Luis Enrique Sucar |
Power Plant Operator Assistant: An Industrial Application of Factored MDPs. |
MICAI |
2004 |
DBLP DOI BibTeX RDF |
|
41 | Mark Kroon, Shimon Whiteson |
Automatic Feature Selection for Model-Based Reinforcement Learning in Factored MDPs. |
ICMLA |
2009 |
DBLP DOI BibTeX RDF |
factored MDPs, feature selection, Reinforcement learning |
41 | Juan Frausto Solís, Elizabeth Santiago D., Jaime Mora-Vargas |
Cosine Policy Iteration for Solving Infinite-Horizon Markov Decision Processes. |
MICAI |
2009 |
DBLP DOI BibTeX RDF |
cosine simplex method, Markov decision processes, hybrid method, policy iteration |
41 | Tiffany Barnes, John C. Stamper |
Toward Automatic Hint Generation for Logic Proof Tutoring Using Historical Student Data. |
Intelligent Tutoring Systems |
2008 |
DBLP DOI BibTeX RDF |
|
41 | Carlos Diuk, Andre Cohen, Michael L. Littman |
An object-oriented representation for efficient reinforcement learning. |
ICML |
2008 |
DBLP DOI BibTeX RDF |
|
41 | Stefan J. Witwicki, Edmund H. Durfee |
Commitment-driven distributed joint policy search. |
AAMAS |
2007 |
DBLP DOI BibTeX RDF |
coordination, negotiation, agent modeling |
41 | Janusz Marecki, Milind Tambe |
On opportunistic techniques for solving decentralized Markov decision processes with temporal constraints. |
AAMAS |
2007 |
DBLP DOI BibTeX RDF |
decentralized Markov decision process, locally optimal solution, multi-agent systems, temporal constraints |
41 | Alberto Reyes, Luis Enrique Sucar, Eduardo F. Morales 0001, Pablo H. Ibargüengoytia |
Solving Hybrid Markov Decision Processes. |
MICAI |
2006 |
DBLP DOI BibTeX RDF |
|
41 | Shulin Cui, Jigui Sun, Minghao Yin, Shuai Lu 0001 |
Solving Uncertain Markov Decision Problems: An Interval-Based Method. |
ICNC (2) |
2006 |
DBLP DOI BibTeX RDF |
|
41 | Aurélie Beynier, Abdel-Illah Mouaddib |
A polynomial algorithm for decentralized Markov decision processes with temporal constraints. |
AAMAS |
2005 |
DBLP DOI BibTeX RDF |
multi-agent systems, uncertainty, planning, Markov decision processes |
41 | Raphen Becker, Shlomo Zilberstein, Victor R. Lesser |
Decentralized Markov Decision Processes with Event-Driven Interactions. |
AAMAS |
2004 |
DBLP DOI BibTeX RDF |
|
41 | Xi-Ren Cao |
From Perturbation Analysis to Markov Decision Processes and Reinforcement Learning. |
Discret. Event Dyn. Syst. |
2003 |
DBLP DOI BibTeX RDF |
gradient-based policy iteration, perturbation realization, TD(λ), Q-learning, Poisson equations, Potentials |
41 | Raphen Becker, Shlomo Zilberstein, Victor R. Lesser, Claudia V. Goldman |
Transition-independent decentralized Markov decision processes. |
AAMAS |
2003 |
DBLP DOI BibTeX RDF |
decentralized MDP, decision-theoretic planning |
40 | Calin Ciufudean, Otilia Ciufudean, Constantin Filote |
New Models for Immune Mechanism Diagnosis. |
MDA |
2008 |
DBLP DOI BibTeX RDF |
Markov Decision Processes (MDPs), Immune mechanisms diagnosis, Petri nets |
40 | Xi-Ren Cao |
Basic Ideas for Event-Based Optimization of Markov Systems. |
Discret. Event Dyn. Syst. |
2005 |
DBLP DOI BibTeX RDF |
Markov decision processes (MDPs), performance potentials, policy gradients, aggregation, perturbation analysis, POMDPs, policy iteration |
30 | Gellért Weisz, András György 0001, Csaba Szepesvári |
Online RL in Linearly q^π-Realizable MDPs Is as Easy as in Linear MDPs If You Learn What to Ignore. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
30 | Runyu Zhang, Yang Hu, Na Li 0002 |
Regularized Robust MDPs and Risk-Sensitive MDPs: Equivalence, Policy Gradient, and Sample Complexity. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
30 | Gellért Weisz, András György 0001, Csaba Szepesvári |
Online RL in Linearly q^π-Realizable MDPs Is as Easy as in Linear MDPs If You Learn What to Ignore. |
NeurIPS |
2023 |
DBLP BibTeX RDF |
|
30 | Eugene A. Feinberg, Jefferson Huang |
Reduction of total-cost and average-cost MDPs with weakly continuous transition probabilities to discounted MDPs. |
Oper. Res. Lett. |
2018 |
DBLP DOI BibTeX RDF |
|
30 | Kim Bauters, Weiru Liu, Lluís Godo |
Anytime Algorithms for Solving Possibilistic MDPs and Hybrid MDPs. |
FoIKS |
2016 |
DBLP DOI BibTeX RDF |
|
30 | Richard S. Sutton, Doina Precup, Satinder Singh 0001 |
Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning. |
Artif. Intell. |
1999 |
DBLP DOI BibTeX RDF |
|
28 | Moser Silva Fagundes, Roberto Centeno, Holger Billhardt, Sascha Ossowski |
Designing Organized Multiagent Systems through MDPs. |
MATES |
2009 |
DBLP DOI BibTeX RDF |
|
28 | Feng Wu 0001, Xiaoping Chen |
Solving Large-Scale and Sparse-Reward DEC-POMDPs with Correlation-MDPs. |
RoboCup |
2007 |
DBLP DOI BibTeX RDF |
|
28 | Song Zhiwei, Chen Xiaoping |
States evolution in Theta(lambda)-learning based on logical MDPs with negation. |
SMC |
2007 |
DBLP DOI BibTeX RDF |
|
28 | Sarah Osentoski, Sridhar Mahadevan |
Learning state-action basis functions for hierarchical MDPs. |
ICML |
2007 |
DBLP DOI BibTeX RDF |
|
28 | Jianhui Wu 0006, Edmund H. Durfee |
Mixed-integer linear programming for transition-independent decentralized MDPs. |
AAMAS |
2006 |
DBLP DOI BibTeX RDF |
transition-independent decentralized MDP, mixed integer linear programming, MDP, piecewise linear approximation |
28 | Gerardo I. Simari, Simon Parsons |
On the relationship between MDPs and the BDI architecture. |
AAMAS |
2006 |
DBLP DOI BibTeX RDF |
policy, Markov decision process, intention |
28 | Dmitri A. Dolgov, Edmund H. Durfee |
Computationally-efficient combinatorial auctions for resource allocation in weakly-coupled MDPs. |
AAMAS |
2005 |
DBLP DOI BibTeX RDF |
distributed implementation, generalized Vickrey auctions, Markov decision processes, combinatorial auctions |
28 | David I. Ferguson, Anthony Stentz |
Focussed Propagation of MDPs for Path Planning. |
ICTAI |
2004 |
DBLP DOI BibTeX RDF |
|
27 | Lihong Li 0001, Michael L. Littman, Christopher R. Mansley |
Online exploration in least-squares policy iteration. |
AAMAS (2) |
2009 |
DBLP BibTeX RDF |
PAC-MDP, least-squares policy iteration (LSPI), reinforcement learning, Markov decision processes, exploration |
27 | Yanjie Li, Baoqun Yin, Hongsheng Xi |
Partially Observable Markov Decision Processes and Performance Sensitivity Analysis. |
IEEE Trans. Syst. Man Cybern. Part B |
2008 |
DBLP DOI BibTeX RDF |
|
27 | Pritam Roy, David Parker 0001, Gethin Norman, Luca de Alfaro |
Symbolic Magnifying Lens Abstraction in Markov Decision Processes. |
QEST |
2008 |
DBLP DOI BibTeX RDF |
|
27 | Ronald Ortner |
Pseudometrics for State Aggregation in Average Reward Markov Decision Processes. |
ALT |
2007 |
DBLP DOI BibTeX RDF |
|
27 | Trevor Walker, Lisa Torrey, Jude W. Shavlik, Richard Maclin |
Building Relational World Models for Reinforcement Learning. |
ILP |
2007 |
DBLP DOI BibTeX RDF |
|
27 | Aaron Wilson, Alan Fern, Soumya Ray, Prasad Tadepalli |
Multi-task reinforcement learning: a hierarchical Bayesian approach. |
ICML |
2007 |
DBLP DOI BibTeX RDF |
|
27 | Jeffrey Johns, Sridhar Mahadevan |
Constructing basis functions from directed graphs for value function approximation. |
ICML |
2007 |
DBLP DOI BibTeX RDF |
|
27 | Jennifer Boger, Jesse Hoey, Pascal Poupart, Craig Boutilier, Geoff R. Fernie, Alex Mihailidis |
A Planning System Based on Markov Decision Processes to Guide People With Dementia Through Activities of Daily Living. |
IEEE Trans. Inf. Technol. Biomed. |
2006 |
DBLP DOI BibTeX RDF |
|
27 | Haibo Zhao, Prashant Doshi |
A Hierarchical Framework for Composing Nested Web Processes. |
ICSOC |
2006 |
DBLP DOI BibTeX RDF |
|
27 | Marta Z. Kwiatkowska, Gethin Norman, David Parker 0001 |
Game-based Abstraction for Markov Decision Processes. |
QEST |
2006 |
DBLP DOI BibTeX RDF |
|
27 | Nicole Immorlica, Kamal Jain, Mohammad Mahdian |
Game-Theoretic Aspects of Designing Hyperlink Structures. |
WINE |
2006 |
DBLP DOI BibTeX RDF |
|
27 | Krishnendu Chatterjee, Rupak Majumdar, Thomas A. Henzinger |
Markov Decision Processes with Multiple Objectives. |
STACS |
2006 |
DBLP DOI BibTeX RDF |
|
27 | Kristian Kersting, Luc De Raedt |
Logical Markov Decision Programs and the Convergence of Logical TD(lambda). |
ILP |
2004 |
DBLP DOI BibTeX RDF |
|
27 | Dmitri A. Dolgov, Edmund H. Durfee |
Graphical Models in Local, Asymmetric Multi-Agent Markov Decision Processes. |
AAMAS |
2004 |
DBLP DOI BibTeX RDF |
|
27 | Fletcher Lu, Dale Schuurmans |
Model-Based Least-Squares Policy Evaluation. |
AI |
2003 |
DBLP DOI BibTeX RDF |
|
27 | Mohammad Ghavamzadeh, Sridhar Mahadevan |
A multiagent reinforcement learning algorithm by dynamically merging Markov decision processes. |
AAMAS |
2002 |
DBLP DOI BibTeX RDF |
|
26 | Jianhui Wu 0006, Edmund H. Durfee |
Sequential resource allocation in multiagent systems with uncertainties. |
AAMAS |
2007 |
DBLP DOI BibTeX RDF |
constrained MDPs, mission phasing, sequential resource allocation, mixed integer linear programming |
26 | Thomas A. Wagner, Anita Raja, Victor R. Lesser |
Modeling Uncertainty and its Implications to Sophisticated Control in Tæms Agents. |
Auton. Agents Multi Agent Syst. |
2006 |
DBLP DOI BibTeX RDF |
Agent scheduling, Contingency analysis, Uncertainty, Intelligent agents, Control, MDPs |
26 | Jiaying Shen, Victor R. Lesser, Norman Carver |
Minimizing communication cost in a distributed Bayesian network using a decentralized MDP. |
AAMAS |
2003 |
DBLP DOI BibTeX RDF |
decentralized MDPs, Bayesian networks, action selection, decision-theoretic planning, coordination of multiple agents |
15 | Sivaramakrishnan Ramani, Archis Ghate |
A Family of s-Rectangular Robust MDPs: Relative Conservativeness, Asymptotic Analyses, and Finite-Sample Properties. |
SIAM J. Optim. |
2024 |
DBLP DOI BibTeX RDF |
|
15 | Long-Fei Li, Peng Zhao 0006, Zhi-Hua Zhou |
Improved Algorithm for Adversarial Linear Mixture MDPs with Bandit Feedback and Unknown Transition. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
15 | Toshinori Kitamura, Tadashi Kozuno, Masahiro Kato, Yuki Ichihara, Soichiro Nishimori, Akiyoshi Sannai, Sho Sonoda, Wataru Kumagai, Yutaka Matsuo |
A Policy Gradient Primal-Dual Algorithm for Constrained MDPs with Uniform PAC Guarantees. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
15 | Berk Bozkurt, Aditya Mahajan, Ashutosh Nayyar, Yi Ouyang 0002 |
Model approximation in MDPs with unbounded per-step cost. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
15 | Adrian Müller, Pragnya Alatur, Volkan Cevher, Giorgia Ramponi, Niao He |
Truly No-Regret Learning in Constrained MDPs. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
15 | Michael Gimelfarb, Ayal Taitler, Scott Sanner |
Constraint-Generation Policy Optimization (CGPO): Nonlinear Programming for Policy Optimization in Mixed Discrete-Continuous MDPs. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
15 | Junze Deng, Yuan Cheng, Shaofeng Zou, Yingbin Liang |
Sample Complexity Characterization for Linear Contextual MDPs. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
15 | Davide Maran, Alberto Maria Metelli, Matteo Papini, Marcello Restelli |
No-Regret Reinforcement Learning in Smooth MDPs. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
15 | Francesco Emanuele Stradi, Matteo Castiglioni, Alberto Marchesi 0001, Nicola Gatti 0001 |
Learning Adversarial MDPs with Stochastic Hard Constraints. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
15 | Kazuki Watanabe 0003, Marck van der Vegt, Ichiro Hasuo, Jurriaan Rot, Sebastian Junges |
Pareto Curves for Compositionally Model Checking String Diagrams of MDPs. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
15 | Matthew Zurek, Yudong Chen |
Span-Based Optimal Sample Complexity for Weakly Communicating and General Average Reward MDPs. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
15 | Hei Yi Mak, Flint Xiaofeng Fan, Luca A. Lanzendörfer, Cheston Tan, Wei Tsang Ooi, Roger Wattenhofer |
CAESAR: Enhancing Federated RL in Heterogeneous MDPs through Convergence-Aware Sampling with Screening. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
15 | Menno van Zutphen, Giannis Delimpaltadakis, Maurice Heemels, Duarte Antunes |
Predictable Interval MDPs through Entropy Regularization. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
15 | Kihyuk Hong, Ambuj Tewari |
A Primal-Dual Algorithm for Offline Constrained Reinforcement Learning with Low-Rank MDPs. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
15 | Qinbo Bai, Washim Uddin Mondal, Vaneet Aggarwal |
Learning General Parameterized Policies for Infinite Horizon Average Reward Constrained MDPs via Primal-Dual Policy Gradient Algorithm. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
15 | Omer Ben-Porat, Yishay Mansour, Michal Moshkovitz, Boaz Taitler |
Principal-Agent Reward Shaping in MDPs. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
15 | Prashansa Panda, Shalabh Bhatnagar |
Critic-Actor for Average Reward MDPs with Function Approximation: A Finite-Time Analysis. |
CoRR |
2024 |
DBLP DOI BibTeX RDF |
|
15 | Ian A. Kash, Lev Reyzin, Zishun Yu |
Slowly Changing Adversarial Bandit Algorithms are Efficient for Discounted MDPs. |
ALT |
2024 |
DBLP BibTeX RDF |
|
15 | Yulong Gao, Karl Henrik Johansson, Alessandro Abate |
CTL Model Checking of MDPs over Distribution Spaces: Algorithms and Sampling-based Computations. |
HSCC |
2024 |
DBLP DOI BibTeX RDF |
|
15 | Mateo Perez, Fabio Somenzi, Ashutosh Trivedi 0001 |
A PAC Learning Algorithm for LTL and Omega-Regular Objectives in MDPs. |
AAAI |
2024 |
DBLP DOI BibTeX RDF |
|
15 | Long-Fei Li, Peng Zhao 0006, Zhi-Hua Zhou |
Dynamic Regret of Adversarial MDPs with Unknown Transition and Linear Function Approximation. |
AAAI |
2024 |
DBLP DOI BibTeX RDF |
|
15 | Omer Ben-Porat, Yishay Mansour, Michal Moshkovitz, Boaz Taitler |
Principal-Agent Reward Shaping in MDPs. |
AAAI |
2024 |
DBLP DOI BibTeX RDF |
|
15 | Uri Gadot, Esther Derman, Navdeep Kumar, Maxence Mohamed Elfatihi, Kfir Levy, Shie Mannor |
Solving Non-rectangular Reward-Robust MDPs via Frequency Regularization. |
AAAI |
2024 |
DBLP DOI BibTeX RDF |
|
15 | Kazuki Watanabe 0003, Marck van der Vegt, Ichiro Hasuo, Jurriaan Rot, Sebastian Junges |
Pareto Curves for Compositionally Model Checking String Diagrams of MDPs. |
TACAS (2) |
2024 |
DBLP DOI BibTeX RDF |
|
15 | Junze Deng, Yuan Cheng, Shaofeng Zou, Yingbin Liang |
Sample Complexity Characterization for Linear Contextual MDPs. |
AISTATS |
2024 |
DBLP BibTeX RDF |
|
15 | Long-Fei Li, Peng Zhao, Zhi-Hua Zhou |
Improved Algorithm for Adversarial Linear Mixture MDPs with Bandit Feedback and Unknown Transition. |
AISTATS |
2024 |
DBLP BibTeX RDF |
|
15 | Miruna Oprescu, Andrew Bennett, Nathan Kallus |
Low-rank MDPs with Continuous Action Spaces. |
AISTATS |
2024 |
DBLP BibTeX RDF |
|
15 | Germano Gabbianelli, Gergely Neu, Matteo Papini, Nneka Okolo |
Offline Primal-Dual Reinforcement Learning for Linear MDPs. |
AISTATS |
2024 |
DBLP BibTeX RDF |
|
15 | Uday Kumar M, Veeraruna Kavitha, Sanjay P. Bhat, Nandyala Hemachandra |
Optimal Markov Policies for Finite-Horizon Constrained MDPs With Combined Additive and Multiplicative Utilities. |
IEEE Control. Syst. Lett. |
2023 |
DBLP DOI BibTeX RDF |
|
15 | Qinbo Bai, Vaneet Aggarwal, Ather Gattami |
Provably Sample-Efficient Model-Free Algorithm for MDPs with Peak Constraints. |
J. Mach. Learn. Res. |
2023 |
DBLP BibTeX RDF |
|
15 | Ali Devran Kara, Naci Saldi, Serdar Yüksel |
Q-Learning for MDPs with General Spaces: Convergence and Near Optimality via Quantization under Weak Continuity. |
J. Mach. Learn. Res. |
2023 |
DBLP BibTeX RDF |
|
15 | Shaorong Xie, Zhenyu Zhang 0013, Hang Yu 0006, Xiangfeng Luo |
Recurrent prediction model for partially observable MDPs. |
Inf. Sci. |
2023 |
DBLP DOI BibTeX RDF |
|
15 | Kosuke Sakamoto, Yasuharu Kunii |
A MDPs-Based Dynamic Path Planning in Unknown Environments for Hopping Locomotion. |
IEEE Access |
2023 |
DBLP DOI BibTeX RDF |
|
15 | Ravi N. Haksar, Mac Schwager |
Constrained Control of Large Graph-Based MDPs Under Measurement Uncertainty. |
IEEE Trans. Autom. Control. |
2023 |
DBLP DOI BibTeX RDF |
|
15 | Frantisek Blahoudek, Petr Novotný 0001, Melkior Ornik, Pranay Thangeda, Ufuk Topcu |
Efficient Strategy Synthesis for MDPs With Resource Constraints. |
IEEE Trans. Autom. Control. |
2023 |
DBLP DOI BibTeX RDF |
|
15 | S. Akshay 0001, Krishnendu Chatterjee, Tobias Meggendorfer, Dorde Zikelic |
MDPs as Distribution Transformers: Affine Invariant Synthesis for Safety Objectives. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
15 | Junkai Zhang, Weitong Zhang, Quanquan Gu |
Optimal Horizon-Free Reward-Free Exploration for Linear Mixture MDPs. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
15 | Matthew Zurek, Yudong Chen |
Span-Based Optimal Sample Complexity for Average Reward MDPs. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
15 | Kasper Engelen, Guillermo A. Pérez 0001, Shrisha Rao 0002 |
Graph-Based Reductions for Parametric and Weighted MDPs. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
15 | Zakaria Mhammedi, Adam Block, Dylan J. Foster, Alexander Rakhlin |
Efficient Model-Free Exploration in Low-Rank MDPs. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|
15 | Ted Moskovitz, Brendan O'Donoghue, Vivek Veeriah, Sebastian Flennerhag, Satinder Singh 0001, Tom Zahavy |
ReLOAD: Reinforcement Learning with Optimistic Ascent-Descent for Last-Iterate Convergence in Constrained MDPs. |
CoRR |
2023 |
DBLP DOI BibTeX RDF |
|