Publications
Click on author names or tags to filter publications.
Selected filter tags (click to remove): Georgios-Papoudakis
2021
Georgios Papoudakis, Filippos Christianos, Lukas Schäfer, Stefano V. Albrecht
Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks
Conference on Neural Information Processing Systems, Datasets and Benchmarks Track (NeurIPS), 2021
Abstract | BibTex | arXiv | Code
NeurIPSdeep-rlmulti-agent-rl
Abstract:
Multi-agent deep reinforcement learning (MARL) suffers from a lack of commonly-used evaluation tasks and criteria, making comparisons between approaches difficult. In this work, we consistently evaluate and compare three different classes of MARL algorithms (independent learning, centralised multi-agent policy gradient, value decomposition) in a diverse range of cooperative multi-agent learning tasks. Our experiments serve as a reference for the expected performance of algorithms across different learning tasks, and we provide insights regarding the effectiveness of different learning approaches. We open-source EPyMARL, which extends the PyMARL codebase [Samvelyan et al., 2019] to include additional algorithms and allow for flexible configuration of algorithm implementation details such as parameter sharing. Finally, we open-source two environments for multi-agent research which focus on coordination under sparse rewards.
@inproceedings{papoudakis2021benchmarking,
title={Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks},
author={Georgios Papoudakis and Filippos Christianos and Lukas Sch\"afer and Stefano V. Albrecht},
booktitle = {Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks (NeurIPS)},
year={2021},
url = {http://arxiv.org/abs/2006.07869},
openreview = {https://openreview.net/forum?id=cIrPX-Sn5n},
code = {https://github.com/uoe-agents/epymarl}
}
Georgios Papoudakis, Filippos Christianos, Stefano V. Albrecht
Agent Modelling under Partial Observability for Deep Reinforcement Learning
Conference on Neural Information Processing Systems (NeurIPS), 2021
Abstract | BibTex | arXiv | Code
NeurIPSdeep-rlagent-modelling
Abstract:
Modelling the behaviours of other agents is essential for understanding how agents interact and making effective decisions. Existing methods for agent modelling commonly assume knowledge of the local observations and chosen actions of the modelled agents during execution. To eliminate this assumption, we extract representations from the local information of the controlled agent using encoder-decoder architectures. Using the observations and actions of the modelled agents during training, our models learn to extract representations about the modelled agents conditioned only on the local observations of the controlled agent. The representations are used to augment the controlled agent's decision policy which is trained via deep reinforcement learning; thus, during execution, the policy does not require access to other agents' information. We provide a comprehensive evaluation and ablations studies in cooperative, competitive and mixed multi-agent environments, showing that our method achieves significantly higher returns than baseline methods which do not use the learned representations.
@inproceedings{papoudakis2021local,
title={Agent Modelling under Partial Observability for Deep Reinforcement Learning},
author={Georgios Papoudakis and Filippos Christianos and Stefano V. Albrecht},
booktitle = {Proceedings of the Neural Information Processing Systems (NeurIPS)},
year = {2021}
}
Filippos Christianos, Georgios Papoudakis, Arrasy Rahman, Stefano V. Albrecht
Scaling Multi-Agent Reinforcement Learning with Selective Parameter Sharing
International Conference on Machine Learning (ICML), 2021
Abstract | BibTex | arXiv | Video | Code
ICMLdeep-rlmulti-agent-rl
Abstract:
Sharing parameters in multi-agent deep reinforcement learning has played an essential role in allowing algorithms to scale to a large number of agents. Parameter sharing between agents significantly decreases the number of trainable parameters, shortening training times to tractable levels, and has been linked to more efficient learning. However, having all agents share the same parameters can also have a detrimental effect on learning. We demonstrate the impact of parameter sharing methods on training speed and converged returns, establishing that when applied indiscriminately, their effectiveness is highly dependent on the environment. We propose a novel method to automatically identify agents which may benefit from sharing parameters by partitioning them based on their abilities and goals. Our approach combines the increased sample efficiency of parameter sharing with the representational capacity of multiple independent networks to reduce training time and increase final returns.
@inproceedings{christianos2021scaling,
title={Scaling Multi-Agent Reinforcement Learning with Selective Parameter Sharing},
author={Filippos Christianos and Georgios Papoudakis and Arrasy Rahman and Stefano V. Albrecht},
booktitle={International Conference on Machine Learning (ICML)},
year={2021}
}
2020
Georgios Papoudakis, Stefano V. Albrecht
Variational Autoencoders for Opponent Modeling in Multi-Agent Systems
AAAI Workshop on Reinforcement Learning in Games (AAAI), 2020
Abstract | BibTex | arXiv
AAAIdeep-rlagent-modelling
Abstract:
Multi-agent systems exhibit complex behaviors that emanate from the interactions of multiple agents in a shared environment. In this work, we are interested in controlling one agent in a multi-agent system and successfully learn to interact with the other agents that have fixed policies. Modeling the behavior of other agents (opponents) is essential in understanding the interactions of the agents in the system. By taking advantage of recent advances in unsupervised learning, we propose modeling opponents using variational autoencoders. Additionally, many existing methods in the literature assume that the opponent models have access to opponent's observations and actions during both training and execution. To eliminate this assumption, we propose a modification that attempts to identify the underlying opponent model using only local information of our agent, such as its observations, actions, and rewards. The experiments indicate that our opponent modeling methods achieve equal or greater episodic returns in reinforcement learning tasks against another modeling method.
@inproceedings{papoudakis2020variational,
title={Variational Autoencoders for Opponent Modeling in Multi-Agent Systems},
author={Georgios Papoudakis and Stefano V. Albrecht},
booktitle={AAAI Workshop on Reinforcement Learning in Games},
year={2020}
}
Georgios Papoudakis, Filippos Christianos , Lukas Schäfer, Stefano V. Albrecht
Comparative Evaluation of Multi-Agent Deep Reinforcement Learning Algorithms
arXiv:2006.07869, 2020
Abstract | BibTex | arXiv
deep-rlmulti-agent-rl
Abstract:
Multi-agent deep reinforcement learning (MARL) suffers from a lack of commonly-used evaluation tasks and criteria, making comparisons between approaches difficult. In this work, we evaluate and compare three different classes of MARL algorithms (independent learners, centralised training with decentralised execution, and value decomposition) in a diverse range of multi-agent learning tasks. Our results show that (1) algorithm performance depends strongly on environment properties and no algorithm learns efficiently across all learning tasks; (2) independent learners often achieve equal or better performance than more complex algorithms; (3) tested algorithms struggle to solve multi-agent tasks with sparse rewards. We report detailed empirical data, including a reliability analysis, and provide insights into the limitations of the tested algorithms.
@misc{papoudakis2020comparative,
title={Comparative Evaluation of Multi-Agent Deep Reinforcement Learning Algorithms},
author={Georgios Papoudakis and Filippos Christianos and Lukas Sch\"afer and Stefano V. Albrecht},
year={2020},
eprint={2006.07869},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
Georgios Papoudakis, Filippos Christianos, Stefano V. Albrecht
Local Information Opponent Modelling Using Variational Autoencoders
arXiv:2006.09447, 2020
Abstract | BibTex | arXiv
deep-rlagent-modelling
Abstract:
Modelling the behaviours of other agents (opponents) is essential for understanding how agents interact and making effective decisions. Existing methods for opponent modelling commonly assume knowledge of the local observations and chosen actions of the modelled opponents, which can significantly limit their applicability. We propose a new modelling technique based on variational autoencoders, which are trained to reconstruct the local actions and observations of the opponent based on embeddings which depend only on the local observations of the modelling agent (its observed world state, chosen actions, and received rewards). The embeddings are used to augment the modelling agent's decision policy which is trained via deep reinforcement learning; thus the policy does not require access to opponent observations. We provide a comprehensive evaluation and ablation study in diverse multi-agent tasks, showing that our method achieves comparable performance to an ideal baseline which has full access to opponent's information, and significantly higher returns than a baseline method which does not use the learned embeddings.
@misc{papoudakis2020opponent,
title={Local Information Opponent Modelling Using Variational Autoencoders},
author={Georgios Papoudakis and Filippos Christianos and Stefano V. Albrecht},
year={2020},
eprint={2006.09447},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
2019
Georgios Papoudakis, Filippos Christianos, Arrasy Rahman, Stefano V. Albrecht
Dealing with Non-Stationarity in Multi-Agent Deep Reinforcement Learning
arXiv:1906.04737, 2019
Abstract | BibTex | arXiv
surveydeep-rlmulti-agent-rl
Abstract:
Recent developments in deep reinforcement learning are concerned with creating decision-making agents which can perform well in various complex domains. A particular approach which has received increasing attention is multi-agent reinforcement learning, in which multiple agents learn concurrently to coordinate their actions. In such multi-agent environments, additional learning problems arise due to the continually changing decision-making policies of agents. This paper surveys recent works that address the non-stationarity problem in multi-agent deep reinforcement learning. The surveyed methods range from modifications in the training procedure, such as centralized training, to learning representations of the opponent's policy, meta-learning, communication, and decentralized learning. The survey concludes with a list of open problems and possible lines of future research.
@misc{papoudakis2019dealing,
title={Dealing with Non-Stationarity in Multi-Agent Deep Reinforcement Learning},
author={Georgios Papoudakis and Filippos Christianos and Arrasy Rahman and Stefano V. Albrecht},
year={2019},
eprint={1906.04737},
archivePrefix={arXiv},
primaryClass={cs.LG}
}