The UK Multi-Agent Systems Symposium

Date: 2020-06-25

On February 24th, 2020, I had the opportunity to visit the Alan Turing Institute in the British Library, London, for their UK Multi-Agent Systems Symposium. This event aimed to gather and connect world-leading UK-based research labs from academia and industry with a substantial focus on multi-agent systems to discuss the current landscape and future of this field. The virtual map below provides information about the various research labs which were represented at the event (it will be continually updated):

Introduction to Multi-Agent Systems

Dr. Stefano Albrecht opened the symposium by introducing multi-agent systems (MAS). Such systems are concerned with decision-making tasks in which multiple agents act in a shared environment. Agents can observe their environment (partially or fully), act to change it, and may have different or aligned goals. Such research becomes especially interesting whenever substantial interaction between agents, in the form of cooperation or competition, is required. This interaction can take a variety of forms, as seen in the wide range of possible applications, such as:

Negotiations: agents are constantly negotiating in automated auctions and trading for online advertisements and other financial applications.
Robot interactions: robots are deployed in care settings to interact with humans, and in large robotic warehouses such as those operated by Amazon and Ocado.
Autonomous vehicles: automating huge networks of car traffic through autonomous vehicles requires multi-agent solutions.

MAS research lies at the intersection of multiple research fields, spanning game theory, negotiation, modelling and verification, as well as artificial intelligence with multi-agent planning and learning, among many others. The symposium gathered the UK research community surrounding these fields to discuss the current state and future of MAS.

The History of Multi-Agent Systems

After this brief introduction, Prof. Michael Wooldridge outlined the long history of MAS since their inception in the 1980s. While prior influences trace back as early as 1969, the first workshop on distributed artificial intelligence, considered the beginning of the MAS research community, was held in June 1980 at Endicott House, Massachusetts Institute of Technology [1]. Early research was mostly concerned with distributed problem solving and therefore with means of distribution such as message passing and, more generally, the sharing of information. Two of the major influences on MAS at the time were game theory [2] and rational decision-making as a form of practical reasoning, seen in systems such as Shakey [3]. The notions of deliberation (what to achieve?) and means-ends reasoning (how to achieve it?) in practical reasoning raised the question of rationality. In the context of MAS in particular, these ideas manifested in the formalization of communication actions between agents.

Infancy (1969 - 1980)
Adolescence (1980 - 1994)
Prime (1994 - 2010)
Mid-life crisis (2010 - 2020)

With the establishment of major conferences and journals in 1994, the field and term MAS took off with many new ideas and influences driving the research forward.

A major area throughout these years was negotiation [4], the process of reaching an agreement on a matter of common interest. Research on automated negotiation brought auction protocols such as the Vickrey-Clarke-Groves (VCG) mechanism [5] into the field; these are nowadays widely applied in online advertisement auctions. Multi-agent negotiation systems are also applied in high-frequency trading. Besides negotiation, voting theory, security games and rational verification shaped the field of multi-agent systems and opened up a further pool of applications for this research.
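To make the auction idea concrete, here is a minimal sketch of the sealed-bid second-price (Vickrey) auction, which is the single-item special case of VCG: the highest bidder wins but pays only the second-highest bid, so bidding one's true value is a dominant strategy. The function name, bidder names and tie-breaking rule are illustrative assumptions, not something presented at the symposium.

```python
from typing import Dict, Tuple


def vickrey_auction(bids: Dict[str, float]) -> Tuple[str, float]:
    """Run a single-item sealed-bid second-price auction.

    Returns the winning bidder and the price they pay.
    """
    if len(bids) < 2:
        raise ValueError("a second-price auction needs at least two bidders")
    # Rank bidders by bid (highest first); ties are broken by name
    # purely to keep this toy example deterministic.
    ranked = sorted(bids.items(), key=lambda item: (-item[1], item[0]))
    winner = ranked[0][0]
    price = ranked[1][1]  # the winner pays the second-highest bid
    return winner, price


winner, price = vickrey_auction({"alice": 10.0, "bob": 7.0, "carol": 4.0})
print(winner, price)
```

In this toy run, alice wins and pays bob's bid of 7.0. Raising her own bid would not change her price, and lowering it below 7.0 would only lose her the item, which is the intuition behind truthful bidding.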

"The future is multi-agent - it has to be - [...] but the task today is to figure out what that means." - Prof. Michael Wooldridge

Prof. Wooldridge argued that around 2010 the field of MAS experienced a "mid-life crisis", feeling that it had not achieved the same impact and breakthrough success as machine learning. He also argued that the field in fact has had, and continues to have, an impact in various areas as outlined above. A resurgence is on the horizon, but the research community around MAS has to recognize and address its applications.

The Future of Multi-Agent Systems

Considering the rich history and vast array of applications of MAS, the consensus seemed to be that MAS have a role to play in future AI research. However, the majority of researchers who attended the symposium agreed that there are discussions to be had about the future directions of the field, and these were openly debated at the end of the event.

One prominent thought was the lack of unique applications of MAS research. Why and when is multi-agent important? It is clear that many applications, some of which have been mentioned above, require solutions that consider their multi-agent nature. For instance, negotiations and trading naturally involve multiple parties. However, I believe the lack of identity is not merely an issue with respect to the questions the field is trying to answer, but rather with its methodology. Many multi-agent research publications combine methods from related fields such as game theory and, more prominently in recent years, machine learning (ML), especially deep learning, to build successful MAS applications. It appears to me that the field is aware of its applications and potential, but struggles to find its selling point, its "killer app", to put it in the words of Prof. Wooldridge.

The variety of fields impacting MAS also raised the question of how upcoming MAS researchers should be taught. Is it essential to teach the fundamentals of game theory, modelling, and practical reasoning to understand the roots of the field? Should we focus on ML and deep learning due to their prominence in modern MAS applications? Researchers seemed to agree that it is essential for new talent to know the fundamentals. However, they also recognized that ML is part of modern MAS research and should be taught, not least because of its popularity among students.

The lack of clear separation from fields such as ML is by no means a phenomenon only found in MAS research. ML is becoming increasingly prominent in a large body of research under the broad term artificial intelligence, such as image recognition, classification, recommender systems, natural language processing, and many more. I don't believe this lack of separation harms MAS research; rather, it should be seen as a chance to level up and scale to bigger, previously intractable problems. We should embrace the capabilities modern ML brings while reminding ourselves of the initial questions that led to MAS research in the first place.

Multi-Agent Systems Research in Industry

The close connection between ML and MAS might also lead to breakthrough industry impact, something that has previously been lacking in the field. Industry research labs have demonstrated over the past few years that ML at large scale can have astonishing impact, with AlphaGo and OpenAI Five being just two examples. I was excited to see multiple UK-based research labs from industry present at the symposium.

Google DeepMind

Google DeepMind is a large UK-based research company aiming to “solve intelligence” and achieve Artificial General Intelligence (AGI). Dr. Edward Hughes and Dr. Yoram Bachrach from DeepMind presented some of their research. Their work focuses on deep reinforcement learning to allow agents to solve complex problems in a variety of challenges. In particular, they are looking at MAS as the world constantly requires interaction between multiple agents, and are inspired by human intelligence which has its origins in co-evolution and interaction.

Multi-agent environment of capture-the-flag (Image by DeepMind)

Their capture-the-flag project [6] revolves around a multiplayer game in a Quake arena environment in which two teams, each consisting of two agents, compete against each other. Agents have to pick up the flag of the opposing team while defending their own flag against capture. Training relies on population-based training, intrinsic reward shaping and procedural generation of environments for generalization. In particular, population-based training is an essential component of various successful DeepMind projects, enabling agents to play StarCraft and tuning hyperparameters for neural network optimization. The combination of these approaches allowed intelligent, cooperative behaviour to emerge throughout training. Further challenges the lab is working on are emergent communication and alliances as well as sequential social dilemmas in MAS, such as the tragedy of the commons [7].
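As a rough illustration of the population-based training idea mentioned above, the toy sketch below shows one "exploit and explore" step: the weakest members of a population copy the hyperparameters of the strongest ones and then perturb them. This is a simplified assumption of how such a step can look, not DeepMind's implementation; the names `pbt_step`, `lr` and `score`, the perturbation factors and the replaced fraction are all hypothetical, and a real system would also copy network weights and keep training between steps.

```python
import random
from typing import Dict, List, Tuple

Member = Dict[str, float]  # "lr": a hyperparameter, "score": latest evaluation


def pbt_step(
    population: List[Member],
    rng: random.Random,
    factors: Tuple[float, float] = (0.8, 1.2),
    fraction: float = 0.25,
) -> List[Member]:
    """Apply one exploit/explore step in place and return the population."""
    if len(population) < 2:
        raise ValueError("population-based training needs at least two members")
    ranked = sorted(population, key=lambda m: m["score"], reverse=True)
    # Number of members to replace; never more than half the population,
    # so the top and bottom groups cannot overlap.
    cutoff = max(1, min(len(ranked) // 2, int(len(ranked) * fraction)))
    top, bottom = ranked[:cutoff], ranked[-cutoff:]
    for member in bottom:
        parent = rng.choice(top)
        # Exploit: inherit the parent's hyperparameter (and, in a real
        # system, its network weights). Explore: perturb it slightly.
        member["lr"] = parent["lr"] * rng.choice(factors)
        member["score"] = parent["score"]
    return population


population = [
    {"lr": 0.01, "score": 0.9},
    {"lr": 0.1, "score": 0.5},
    {"lr": 0.3, "score": 0.4},
    {"lr": 1.0, "score": 0.1},
]
pbt_step(population, random.Random(0))
print(population[3])
```

After the step, the previously worst member carries a slightly perturbed copy of the best member's learning rate, while the other members are left untouched.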

Microsoft Research Cambridge

Dr. Sam Devlin, senior researcher at the Game Intelligence lab of Microsoft Research Cambridge, presented their research on video games, which aims to improve gaming as an entertainment medium. This can be achieved by supporting game developers and by improving the overall experience of players, e.g. through intelligent autonomous agents that play alongside human players or by providing a fair ranking and matchmaking system for competitive online gaming. The latter was achieved with MAS in their matchmaking and skill ranking systems TrueMatch and TrueSkill. Malmo [8] is their project on MAS in the gaming environment of Minecraft. The research group organized a multi-agent reinforcement learning (MARL) competition inside Malmo and continues to explore the domain for engaging research.

The Malmo project based on Minecraft (Image by Microsoft Research)

One goal of their research is to allow AI companions to generalize to various scenarios, so that more "human-like" AI agents can collaborate with or compete against human players. One challenge here is ad-hoc teamwork [9], i.e. agents have to generalize over tasks and other agents such that they can work together effectively with a set of potential teammates. To generalize across a large variety of tasks and conditions, population-based concepts and procedurally generated scenarios are often used to train the agents. While these approaches work well in practice, they are computationally very expensive, so simplified, efficient training schemes become desirable [10]. Motivated by the observation that simpler policies often generalize better, an information bottleneck was proposed to learn compressed features of the environment. Additionally, injecting noise into the policy during experience collection was found to lead to poor generalization; this can be avoided by injecting noise only in the updates, which stabilizes training [11].

FiveAI

Dr. Subramanian Ramamoorthy, VP of Prediction and Planning at the UK-based autonomous vehicles startup FiveAI, presented the company's approach to safe decision-making for autonomous driving in urban environments. This task is extremely challenging due to the large variety of possible situations and required behaviours. Across all possible scenarios, cars need to be able to predict the behaviour of other agents (pedestrians, cyclists, drivers), incorporate traffic rules, plan ahead and quantify risk.

A recent publication by the company [12] proposes to use planning for reliable decision-making that takes the high-level goals of other agents into account. Inverse planning is used to recognise the goals of other agents and to predict their behaviour given those goals. However, Dr. Ramamoorthy also emphasized that ML in its current form cannot be sufficiently verified for autonomous cars to rely on it entirely. Guarantees from other approaches need to be combined with effective ML to build a technology stack for safe autonomous driving. Dr. Ramamoorthy stated that uncertainty is key for risk-aware planning in such environments, but that it is still an open research question how uncertainty can be computed and considered at this scale.
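The general idea of goal recognition by inverse planning can be sketched in a few lines: a goal is considered more likely the less the observed behaviour deviates from an optimal plan towards it. The example below is only an illustrative toy under stated assumptions, not the method of [12]: it uses a grid world with Manhattan distance as the planning cost, and the function name `goal_posterior`, the rationality parameter `beta` and the goal names are hypothetical.

```python
import math
from typing import Dict, Tuple

Point = Tuple[int, int]


def manhattan(a: Point, b: Point) -> int:
    """Cost of an optimal plan between two grid cells (assumed cost model)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def goal_posterior(
    start: Point,
    current: Point,
    steps_taken: int,
    goals: Dict[str, Tuple[Point, float]],
    beta: float = 1.0,
) -> Dict[str, float]:
    """Return P(goal | observed behaviour) for each candidate goal.

    `goals` maps a goal name to (location, prior probability).
    """
    scores = {}
    for name, (location, prior) in goals.items():
        optimal_cost = manhattan(start, location)
        # Cost of reaching the goal when forced through the observed position.
        observed_cost = steps_taken + manhattan(current, location)
        extra_cost = observed_cost - optimal_cost
        # The likelihood decays with how irrational the observation would be.
        scores[name] = prior * math.exp(-beta * extra_cost)
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}


posterior = goal_posterior(
    start=(0, 0),
    current=(2, 0),
    steps_taken=2,
    goals={"east": ((5, 0), 0.5), "north": ((0, 5), 0.5)},
)
print(posterior)
```

Here the agent has moved two cells east, which is fully consistent with an optimal plan to the "east" goal but a detour for the "north" goal, so almost all probability mass ends up on "east". A planner could then predict the agent's future trajectory conditioned on the most likely goals.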

Others

J.P. Morgan AI Research Group
Dr. Tucker Balch, research director in J.P. Morgan's AI research group, presented MAS for market simulation and analysis as well as research into MAS for explainability. The group also organizes a conference on AI in finance in New York.
Fetch.AI
Fetch.AI, represented by Dr. David Galindo, is using MAS for finance applications such as MAS-optimised distributed ledgers and blockchain products.
Huawei Reinforcement Learning Lab
Yaodong Yang from the Huawei reinforcement learning lab in London presented their DriveML autonomous vehicles challenge and work to facilitate future autonomous cars.

Conclusion

The attendees of the symposium were hopeful about the future of MAS at the end of the event. Ongoing research in industry and research labs highlights the importance and relevance of the field, despite a potential lack of identity. Overall, the event was a great opportunity to get a grasp of this wide field, get to know many bright minds around MAS, and attend lively and constructive discussions around possible directions.

I want to thank the organizers Dr. Stefano Albrecht, Prof. Michael Wooldridge, Kate Wicks, Lisa Harper, and Serena Lambley for this fantastic event. My gratitude especially goes to my supervisor Dr. Stefano Albrecht who made my attendance at this event possible.

References

[1] Davis, Randall. Report on the Workshop on Distributed AI. 1980.
[2] Rosenschein, Jeffrey S., and Michael R. Genesereth. Deals among Rational Agents. Readings in Distributed Artificial Intelligence. Morgan Kaufmann, 1988. 227-234.
[3] Nilsson, Nils J. Shakey the Robot. SRI International, Menlo Park, CA, 1984.
[4] Rosenschein, Jeffrey S., and Gilad Zlotkin. Rules of Encounter: Designing Conventions for Automated Negotiation among Computers. MIT Press, 1994.
[5] Groves, Theodore. Incentives in Teams. Econometrica: Journal of the Econometric Society, 1973. 617-631.
[6] Jaderberg, M., Czarnecki, W. M., Dunning, I., Marris, L., Lever, G., Castaneda, A. G., ... & Sonnerat, N. Human-level performance in 3D multiplayer games with population-based reinforcement learning. Science, 2019. 364(6443), 859-865.
[7] Hardin, Garrett. The tragedy of the commons. Journal of Natural Resources Policy Research 1.3, 2009. 243-253.
[8] Johnson, M., Hofmann, K., Hutton, T., & Bignell, D. The Malmo Platform for Artificial Intelligence Experimentation. IJCAI. 2016.
[9] Stone, P., Kaminka, G. A., Kraus, S., & Rosenschein, J. S. Ad hoc autonomous agent teams: Collaboration without pre-coordination. Twenty-Fourth AAAI Conference on Artificial Intelligence. 2010.
[10] Harries, L., Lee, S., Rzepecki, J., Hofmann, K., & Devlin, S. MazeExplorer: A Customisable 3D Benchmark for Assessing Generalisation in Reinforcement Learning. 2019 IEEE Conference on Games (CoG). IEEE, 2019.
[11] Igl, M., Ciosek, K., Li, Y., Tschiatschek, S., Zhang, C., Devlin, S., & Hofmann, K. Generalization in reinforcement learning with selective noise injection and information bottleneck. Advances in Neural Information Processing Systems. 2019.
[12] Albrecht, S. V., Brewitt, C., Wilhelm, J., Eiras, F., Dobre, M., & Ramamoorthy, S. Integrating Planning and Interpretable Goal Recognition for Autonomous Driving. arXiv preprint arXiv:2002.02277, 2020.