Federated Reinforcement Distillation with Proxy Experience Memory
Jul 15 2019
In distributed reinforcement learning, it is common to exchange the experience memory of each agent and thereby collectively train their local models. The experience memory, however, contains all the preceding state observations and their corresponding ...

NH-TTC: A gradient-based framework for generalized anticipatory collision avoidance
Jul 12 2019
We propose NH-TTC, a general method for fast, anticipatory collision avoidance for autonomous robots having arbitrary equations of motion. Our proposed approach exploits implicit differentiation and subgradient descent to locally optimize the non-convex ...

From Observability to Significance in Distributed Information Systems
Jul 12 2019
To understand and explain process behaviour we need to be able to see it, and decide its significance, i.e. be able to tell a story about its behaviours. This paper describes a few of the modelling challenges that underlie monitoring and observation of ...

Rethink Global Reward Game and Credit Assignment in Multi-agent Reinforcement Learning
Jul 11 2019
Cooperative games are a critical research area in multi-agent reinforcement learning (MARL). The global reward game is a subclass of cooperative games, where all agents aim to maximize cumulative global rewards. Credit assignment is an important problem studied ...

A Neural Architecture for Designing Truthful and Efficient Auctions
Jul 11 2019
Auctions are protocols to allocate goods to buyers who have preferences over them, and collect payments in return. Economists have invested significant effort in designing auction rules that result in allocations of the goods that are desirable for the ...

Optimal mechanisms with budget for user generated contents
Jul 10 2019
In this paper, we design gross product maximization mechanisms which incentivize users to upload high-quality content on user-generated-content (UGC) websites. We show that the proportional division mechanism, which is widely used in practice, can perform ...

An Empirical Study on the Practical Impact of Prior Beliefs over Policy Types
Jul 10 2019
Many multiagent applications require an agent to learn quickly how to interact with previously unknown other agents. To address this problem, researchers have studied learning algorithms which compute posterior beliefs over a hypothesised set of policies, ...

Informative Path Planning with Local Penalization for Decentralized and Asynchronous Swarm Robotic Search
Jul 09 2019
Decentralized swarm robotic solutions to searching for targets that emit a spatially varying signal promise task parallelism, time efficiency, and fault tolerance. It is, however, challenging for swarm algorithms to offer scalability and efficiency, while ...

Decentralized Dynamic Task Allocation in Swarm Robotic Systems for Disaster Response
Jul 09 2019
Multiple robotic systems, working together, can provide important solutions to different real-world applications (e.g., disaster response), among which task allocation problems feature prominently. Very few existing decentralized multi-robotic task allocation ...

Vertex-weighted Online Stochastic Matching with Patience Constraints
Jul 09 2019
Online Bipartite Matching is a classic problem introduced by Karp, Vazirani, and Vazirani (Proc. ACM STOC, 1990) and motivated by applications such as e-commerce, online advertising, and ride-sharing. We wish to match a set of online vertices (e.g., webpage ...

Appliance-level Flexible Scheduling for Socio-technical Smart Grid Optimisation
Jul 08 2019
Participation in energy demand response programs requires an active role by users of residential appliances: they contribute flexibility in appliance usage as the means to adjust energy consumption and improve Smart Grid reliability. Understanding the ...

A Communication-Efficient Multi-Agent Actor-Critic Algorithm for Distributed Reinforcement Learning
Jul 06 2019
This paper considers a distributed reinforcement learning problem in which a network of multiple agents aims to cooperatively maximize the globally averaged return through communication with only local neighbors. A randomized communication-efficient multi-agent ...

Sensing Volume Coverage of Robot Workspace using On-Robot Time-of-Flight Sensor Arrays for Safe Human Robot Interaction
Jul 03 2019
In this paper, an analysis of the sensing volume coverage of robot workspace as well as the shared human-robot collaborative workspace for various configurations of on-robot Time-of-Flight (ToF) sensor array rings is presented. A methodology for volumetry ...

Perspective Taking in Deep Reinforcement Learning Agents
Jul 03 2019
Perspective taking is the ability to take the point of view of another agent. This skill is not unique to humans as it is also displayed by other animals like chimpanzees. It is an essential ability for efficient social interactions, including cooperation, ...

Distributed Learning in Non-Convex Environments -- Part II: Polynomial Escape from Saddle-Points
Jul 03 2019
The diffusion strategy for distributed learning from streaming data employs local stochastic gradient updates along with exchange of iterates over neighborhoods. In Part I [2] of this work we established that agents cluster around a network centroid and ...

Distributed Learning in Non-Convex Environments -- Part I: Agreement at a Linear Rate
Jul 03 2019
Driven by the need to solve increasingly complex optimization problems in signal processing and machine learning, there has been increasing interest in understanding the behavior of gradient-descent algorithms in non-convex environments. Most available ...

Koalja: from Data Plumbing to Smart Workspaces in the Extended Cloud
Jul 03 2019
Koalja describes a generalized data wiring or 'pipeline' platform, built on top of Kubernetes, for plugin user code. Koalja makes the Kubernetes underlay transparent to users (for a 'serverless' experience), and offers a breadboarding experience for development ...

Analysis of the Synergy between Modularity and Autonomy in an Artificial Intelligence Based Fleet Competition
Jul 02 2019
A novel approach is provided for evaluating the benefits and burdens from vehicle modularity in fleets/units through the analysis of a game theoretical model of the competition between autonomous vehicle fleets in an attacker-defender game. We present ...

Are You Doing What I Think You Are Doing? Criticising Uncertain Agent Models
Jul 02 2019
The key for effective interaction in many multiagent applications is to reason explicitly about the behaviour of other agents, in the form of a hypothesised behaviour. While there exist several methods for the construction of a behavioural hypothesis, ...

Scalar Field Estimation with Mobile Sensor Networks
Jul 02 2019
In this paper, we consider the problem of estimating a scalar field using a network of mobile sensors which can measure the value of the field at their instantaneous location. The scalar field to be estimated is assumed to be represented by positive definite ...

Online Charge Scheduling for Electric Vehicles in Autonomous Mobility on Demand Fleets
Jul 01 2019
In this paper, we study an online charge scheduling strategy for fleets of autonomous-mobility-on-demand electric vehicles (AMoD EVs). We consider the case where vehicles complete trips and then enter a between-ride state throughout the day, with their ...

Fast and Reliable Dispersal of Crash-Prone Agents on Graphs
Jul 01 2019
We study the ability of mobile agents performing simple local computations to completely cover an unknown graph environment while implicitly constructing a distributed spanning tree. Whenever an agent moves, it may crash and disappear from the environment. ...

Collaboration of AI Agents via Cooperative Multi-Agent Deep Reinforcement Learning
Jun 30 2019
There are many AI tasks involving multiple interacting agents where agents should learn to cooperate and collaborate to effectively perform the task. Here we develop and evaluate various multi-agent protocols to train agents to collaborate with teammates ...

Asymptotic Network Independence in Distributed Optimization for Machine Learning
Jun 28 2019
We provide a discussion of several recent results which have overcome a key barrier in distributed optimization for machine learning. Our focus is the so-called network independence property, which is achieved whenever a distributed method executed over ...

No-boarding buses: Agents allowed to cooperate or defect
Jun 28 2019
We study a bus system with a no-boarding policy, where a "slow" bus may disallow passengers from boarding if it meets some criteria. When the no-boarding policy is activated, people waiting to board at the bus stop are given the choices of cooperating ...

Engineering Token Economy with System Modeling
Jun 27 2019
Cryptocurrencies and blockchain networks have attracted tremendous attention from their volatile price movements and the promise of decentralization. However, most projects run on business narratives with no way to test and verify their assumptions and ...

Traffic Management Strategies for Multi-Robotic Rigid Payload Transport Systems
Jun 27 2019
In this work, we address traffic management of multiple payload transport systems comprising non-holonomic robots. We consider loosely coupled rigid robot formations carrying a payload from one place to another. Each payload transport system (PTS) ...

Interactive Physics-Inspired Traffic Congestion Management
Jun 26 2019, Jun 30 2019
This paper proposes a new physics-based approach to effectively control congestion in a network of interconnected roads (NOIR). The paper integrates mass flow conservation and diffusion-based dynamics to model traffic coordination in a NOIR. The mass ...

Reasoning about Hypothetical Agent Behaviours and their Parameters
Jun 26 2019
Agents can achieve effective interaction with previously unknown other agents by maintaining beliefs over a set of hypothetical behaviours, or types, that these agents may have. A current limitation in this method is that it does not recognise parameters ...

On Multi-Agent Learning in Team Sports Games
Jun 25 2019
In recent years, reinforcement learning has been successful in solving video games from Atari to StarCraft II. However, end-to-end model-free reinforcement learning (RL) is not sample efficient and requires a significant amount of computational resources ...

House Markets and Single-Peaked Preferences: From Centralized to Decentralized Allocation Procedures
Jun 24 2019
Recently, the problem of allocating one resource per agent with initial endowments (house markets) has seen a renewed interest: indeed, while in the general domain Top Trading Cycle is known to be the only procedure guaranteeing Pareto-optimality, ...

Learning to Interactively Learn and Assist
Jun 24 2019, Jul 01 2019
When deploying autonomous agents in the real world, we need effective ways of communicating objectives to them. Traditional skill learning has revolved around reinforcement and imitation learning, each with rigid constraints on the format of information ...

Safe Trajectory Generation for Complex Urban Environments Using Spatio-temporal Semantic Corridor
Jun 24 2019
Planning safe trajectories for autonomous vehicles in complex urban environments is challenging since there are numerous semantic elements (such as dynamic agents, traffic lights and speed limits) to consider. These semantic elements may have different ...

3D Multi-Robot Patrolling with a Two-Level Coordination Strategy
Jun 23 2019
Teams of UGVs patrolling harsh and complex 3D environments can experience interference and spatial conflicts with one another. Neglecting the occurrence of these events crucially hinders both soundness and reliability of a patrolling process. This work ...

Privacy Preserving QoE Modeling using Collaborative Learning
Jun 21 2019, Jun 26 2019
Machine Learning based Quality of Experience (QoE) models potentially suffer from over-fitting due to limitations including low data volume, and limited participant profiles. This prevents models from becoming generic. Consequently, these trained models ...

Topology Inference over Networks with Nonlinear Coupling
Jun 21 2019
This work examines the problem of topology inference over discrete-time nonlinear stochastic networked dynamical systems. The goal is to recover the underlying digraph linking the network agents, from observations of their state-evolution. The dynamical ...

Split Q Learning: Reinforcement Learning with Two-Stream Rewards
Jun 21 2019
Drawing inspiration from behavioral studies of human decision making, we propose here a general parametric framework for a reinforcement learning problem, which extends the standard Q-learning approach to incorporate a two-stream framework of reward ...

The Complexity of Online Bribery in Sequential Elections
Jun 19 2019
Prior work on the complexity of bribery assumes that the bribery happens simultaneously, and that the briber has full knowledge of all voters' votes. But neither of those assumptions always holds. In many real-world settings, votes come in sequentially, ...

Multi-Agent Pathfinding: Definitions, Variants, and Benchmarks
Jun 19 2019
The MAPF problem is the fundamental problem of planning paths for multiple agents, where the key constraint is that the agents will be able to follow these paths concurrently without colliding with each other. Applications of MAPF include automated warehouses ...

Scaling in the recovery of urban transportation systems from special events
Jun 19 2019
Public transportation is a fundamental infrastructure for the daily mobility in cities. Although its capacity is prepared for the usual demand, congestion may rise when huge crowds concentrate in special events such as massive demonstrations, concerts ...

Shared Autonomous Vehicle Simulation and Service Design
Jun 18 2019
Driverless cars are on the way. This technology, allowing more accessible, dynamic and intelligent form of Shared Mobility, is expected to revolutionize urban transportation. One of the conceivable mobility services based on driverless cars are shared ...

Chemotaxis Based Virtual Fence for Swarm Robots in Unbounded Environments
Jun 18 2019
This paper presents a novel swarm robotics application of chemotaxis behaviour observed in microorganisms. This approach was used to cause exploration robots to return to a work area around the swarm's nest within a boundless environment. We investigate ...

Evolutionary Reinforcement Learning for Sample-Efficient Multiagent Coordination
Jun 18 2019
A key challenge for Multiagent RL (Reinforcement Learning) is the design of agent-specific, local rewards that are aligned with sparse global objectives. In this paper, we introduce MERL (Multiagent Evolutionary RL), a hybrid algorithm that does not require ...

Protecting Elections by Recounting Ballots
Jun 17 2019
Complexity of voting manipulation is a prominent topic in computational social choice. In this work, we consider a two-stage voting manipulation scenario. First, a malicious party (an attacker) attempts to manipulate the election outcome in favor of a ...

Simple Swarm Foraging Algorithm Based on Gradient Computation
Jun 17 2019
Swarm foraging is a common test case application for multi-robot systems. In this paper we present a novel algorithm for controlling swarm robots with limited communication range and storage capacity to efficiently search for and retrieve targets within ...

A Generic Approach for Accelerating Belief Propagation based DCOP Algorithms via A Branch-and-Bound Technique
Jun 17 2019
Belief propagation approaches, such as Max-Sum and its variants, are an important class of methods for solving large-scale Distributed Constraint Optimization Problems (DCOPs). However, for problems with n-ary constraints, these algorithms face a huge challenge ...

Learning in Cournot Games with Limited Information Feedback
Jun 15 2019
In this work, we study the interaction of strategic players in continuous action Cournot games with limited information feedback. The Cournot game is the essential market model for many socio-economic systems where players learn and compete without the full ...

Dynamic Term-Modal Logics for Epistemic Planning
Jun 14 2019
Classical planning frameworks are built on first-order languages. The first-order expressive power is desirable for compactly representing actions via schemas, and for specifying goal formulas such as $\neg\exists x\mathsf{blocks\_door}(x)$. In contrast, ...

Extending Eigentrust with the Max-Plus Algebra
Jun 13 2019
Eigentrust is a simple and widely used algorithm, which quantifies trust based on the repeated application of an update matrix to a vector of initial trust values. In some cases, however, this procedure is rendered uninformative. Here, we characterise ...

Decentralised Multi-Demic Evolutionary Approach to the Dynamic Multi-Agent Travelling Salesman Problem
Jun 13 2019
The Travelling Salesman Problem and its variations are some of the best-known NP-hard optimisation problems. This paper looks to use both centralised and decentralised implementations of Evolutionary Algorithms (EA) to solve a dynamic variant of the Multi-Agent ...

Competing Bandits in Matching Markets
Jun 12 2019
Stable matching, a classical model for two-sided markets, has long been studied with little consideration for how each side's preferences are learned. With the advent of massive online markets powered by data-driven matching platforms, it has become necessary ...

Deep learning control of artificial avatars in group coordination tasks
Jun 11 2019
In many joint-action scenarios, humans and robots have to coordinate their movements to accomplish a given shared task. Lifting an object together, sawing a wood log, and transferring objects from one point to another are all examples where motor coordination ...

Dealing with Non-Stationarity in Multi-Agent Deep Reinforcement Learning
Jun 11 2019
Recent developments in deep reinforcement learning are concerned with creating decision-making agents which can perform well in various complex domains. A particular approach which has received increasing attention is multi-agent reinforcement learning, ...

Gossip-based Actor-Learner Architectures for Deep Reinforcement Learning
Jun 09 2019
Multi-simulator training has contributed to the recent success of Deep Reinforcement Learning by stabilizing learning and allowing for higher training throughputs. We propose Gossip-based Actor-Learner Architectures (GALA) where several actor-learners ...

A Distributed Event-Triggered Control Strategy for DC Microgrids Based on Publish-Subscribe Model Over Industrial Wireless Sensor Networks
Jun 09 2019
This paper presents a complete design, analysis, and performance evaluation of a novel distributed event-triggered control and estimation strategy for DC microgrids. The primary objective of this work is to efficiently stabilize the grid voltage, and ...

Federated AI lets a team imagine together: Federated Learning of GANs
Jun 09 2019
Envisioning a new imaginative idea together is a popular human need. Imagining together as a team can often lead to breakthrough ideas, but the collaboration effort can also be challenging, especially when the team members are separated by time and space. ...

Control-guided Communication: Efficient Resource Arbitration and Allocation in Multi-hop Wireless Control Systems
Jun 08 2019
In future autonomous systems, wireless multi-hop communication is key to enable collaboration among distributed agents at low cost and high flexibility. When many agents need to transmit information over the same wireless network, communication becomes ...

A Ride-Matching Strategy For Large Scale Dynamic Ridesharing Services Based on Polar Coordinates
Jun 08 2019
In this paper, we study a challenging problem of how to pool multiple ride-share trip requests in real time under an uncertain environment. The goals are better performance metrics of efficiency and acceptable satisfaction of riders. To solve the problem ...

Fair Division Without Disparate Impact
Jun 06 2019
We consider the problem of dividing items between individuals in a way that is fair both in the sense of distributional fairness and in the sense of not having disparate impact across protected classes. An important existing mechanism for distributionally ...

A Non-Asymptotic Analysis of Network Independence for Distributed Stochastic Gradient Descent
Jun 06 2019, Jun 28 2019
This paper is concerned with minimizing the average of $n$ cost functions over a network, in which agents may communicate and exchange information with their peers in the network. Specifically, we consider the setting where only noisy gradient information ...

A Class of Distributed Event-Triggered Average Consensus Algorithms for Multi-Agent Systems
Jun 06 2019
This paper proposes a class of distributed event-triggered algorithms that solve the average consensus problem in multi-agent systems. By designing events such that a specifically chosen Lyapunov function is monotonically decreasing, event-triggered algorithms ...

Ease-of-Teaching and Language Structure from Emergent Communication
Jun 06 2019
Artificial agents have been shown to learn to communicate when needed to complete a cooperative task. Some level of language structure (e.g., compositionality) has been found in the learned communication protocols. This observed structure is often the ...

Finding Friend and Foe in Multi-Agent Games
Jun 05 2019
AI for multi-agent games like Go, Poker, and Dota has seen great strides in recent years. Yet none of these games address the real-life challenge of cooperation in the presence of unknown and uncertain teammates. This challenge ...

Maximizing Energy Battery Efficiency in Swarm Robotics
Jun 05 2019
Miniaturization and cost, two of the main attractive factors of swarm robotics, have motivated its use as a solution in object collecting tasks, search & rescue missions, and other applications. However, in the current literature only a few papers consider ...

Options as responses: Grounding behavioural hierarchies in multi-agent RL
Jun 04 2019, Jun 06 2019
We propose a novel hierarchical agent architecture for multi-agent reinforcement learning with concealed information. The hierarchy is grounded in the concealed information about other players, which resolves "the chicken or the egg" nature of option ...

Learning Transferable Cooperative Behavior in Multi-Agent Teams
Jun 04 2019
While multi-agent interactions can be naturally modeled as a graph, the environment has traditionally been considered as a black box. We propose to create a shared agent-entity graph, where agents and environmental entities form vertices, and edges exist ...

Mobility based network lifetime in wireless sensor networks: A review
Jun 04 2019
Emerging technologies in micro-electromechanical systems and wireless communications allow mobile wireless sensor networks (MWSNs) to be an increasingly powerful means in many applications such as habitat and environmental monitoring, traffic ...

The Computational Structure of Unintentional Meaning
Jun 03 2019
Speech-acts can have literal meaning as well as pragmatic meaning, but these both involve consequences typically intended by a speaker. Speech-acts can also have unintentional meaning, in which what is conveyed goes above and beyond what was intended. ...

Massive Styles Transfer with Limited Labeled Data
Jun 03 2019
Language style transfer has attracted more and more attention in the past few years. Recent research focuses on improving neural models that transfer from one style to another with labeled data. However, transferring across multiple styles ...

Adaptation and learning over networks under subspace constraints -- Part II: Performance Analysis
Jun 01 2019
Part I of this paper considered optimization problems over networks where agents have individual objectives to meet, or individual parameter vectors to estimate, subject to subspace constraints that require the objectives across the network to lie in ...

Average-case Analysis of the Assignment Problem with Independent Preferences
Jun 01 2019
The fundamental assignment problem is in search of welfare maximization mechanisms to allocate items to agents when the private preferences over indivisible items are provided by self-interested agents. The mainstream mechanism Random Priority ...

The Evolutionary Price of Anarchy: Locally Bounded Agents in a Dynamic Virus Game
May 31 2019
The Price of Anarchy (PoA) is a well-established game-theoretic concept to shed light on coordination issues arising in open distributed systems. Leaving agents to selfishly optimize comes with the risk of ending up in sub-optimal states (in terms of ...

Dynamic Service Composition Orchestrated by Cognitive Agents in Mobile & Pervasive Computing
May 31 2019
Automatic service composition in mobile and pervasive computing faces many challenges due to the complex nature of the environment. Common approaches address service composition from optimization perspectives which are not feasible in practice due to ...

Attentional Policies for Cross-Context Multi-Agent Reinforcement Learning
May 31 2019
Many potential applications of reinforcement learning in the real world involve interacting with other agents whose numbers vary over time. We propose new neural policy architectures for these multi-agent problems. In contrast to other methods of training ...

A Value-based Trust Assessment Model for Multi-agent Systems
May 31 2019
An agent's assessment of its trust in another agent is commonly taken to be a measure of the reliability/predictability of the latter's actions. It is based on the trustor's past observations of the behaviour of the trustee and requires no knowledge of ...

New Algorithms for Functional Distributed Constraint Optimization Problems
May 30 2019
The Distributed Constraint Optimization Problem (DCOP) formulation is a powerful tool to model multi-agent coordination problems that are distributed by nature. The formulation is suitable for problems where variables are discrete and constraint utilities ...

Ridesharing with Driver Location Preferences
May 30 2019
We study revenue-optimal pricing and driver compensation in ridesharing platforms when drivers have heterogeneous preferences over locations. If a platform ignores drivers' location preferences, it may make inefficient trip dispatches; moreover, drivers ...

An Introduction to Engineering Multiagent Industrial Symbiosis Systems: Potentials and Challenges
May 30 2019
Multiagent Systems (MAS) research has reached a maturity that allows it to be confidently applied to real-life complex problems. Successful application of MAS methods for behavior modeling, strategic reasoning, and decentralized governance encouraged us to focus on applicability ...

A Simulation Study of Social-Networking-Driven Smart Recommendations for Internet of Vehicles
May 30 2019
Social aspects of connectivity and information dispersion are often ignored while weighing the potential of Internet of Things (IoT). In the specialized domain of Internet of Vehicles (IoV), Social IoV (SIoV) is introduced in recognition of their importance. ...

Cognitively-inspired Agent-based Service Composition for Mobile & Pervasive Computing
May 29 2019
Automatic service composition in mobile and pervasive computing faces many challenges due to the complex and highly dynamic nature of the environment. Common approaches consider service composition as a decision problem whose solution is usually addressed ...

Modeling Theory of Mind in Multi-Agent Games Using Adaptive Feedback Control
May 29 2019
A major challenge in cognitive science and AI has been to understand how autonomous agents might acquire and predict behavioral and mental states of other agents in the course of complex social interactions. How does such an agent model the goals, beliefs, ...

Correlation in Extensive-Form Games: Saddle-Point Formulation and Benchmarks
May 29 2019
While Nash equilibrium in extensive-form games is well understood, very little is known about the properties of extensive-form correlated equilibrium (EFCE), both from a behavioral and from a computational point of view. In this setting, the strategic ...

Anti-efficient encoding in emergent communication
May 29 2019, Jun 13 2019
Despite renewed interest in emergent language simulations with neural networks, little is known about the basic properties of the induced code, and how they compare to human language. One fundamental characteristic of the latter, known as Zipf's Law of ...

Robo-Taxi service fleet sizing: assessing the impact of user trust and willingness-to-use
May 29 2019
The first commercial fleets of Robo-Taxis will be on the road soon. Today, important efforts are being made to anticipate future Robo-Taxi services. Fleet size is one of the key parameters considered in the planning phase of service design and configuration. ...

Scalable and transferable learning of algorithms via graph embedding for multi-robot reward collection
May 29 2019
Can the success of reinforcement learning methods for combinatorial optimization problems be extended to multi-robot scheduling problems in stochastic contexts? Three issues are particularly important in this context: quality of the resulting decisions, ...