Abstract
We consider a multi-robot setting, where we have a fleet of multi-capacityautonomous robots that must service spatially distributed pickup-and-deliveryrequests with fixed maximum wait times. Requests can be either scheduled aheadof time or they can enter the system in real-time. In this setting, stabilityfor a routing policy is defined as the cost of the policy being uniformlybounded over time. Most previous work either solve the problem offline totheoretically maintain stability or they consider dynamically arriving requestsat the expense of the theoretical guarantees on stability. In this paper, weaim to bridge this gap by proposing a novel proactive rollout-based routingframework that adapts to real-time demand while still provably maintaining thestability of the learned routing policy. We derive provable stabilityguarantees for our method by proposing a fleet sizing algorithm that obtains asufficiently large fleet that ensures stability by construction. To validateour theoretical results, we consider a case study on real ride requests forHarvard's evening Van System. We also evaluate the performance of our frameworkusing the currently deployed smaller fleet size. In this smaller setup, wecompare against the currently deployed routing algorithm, greedy heuristics,and Monte-Carlo-Tree-Search-based algorithms. Our empirical results show thatour framework maintains stability when we use the sufficiently large fleet sizefound in our theoretical results. For the smaller currently deployed fleetsize, our method services 6% more requests than the closest baseline whilereducing median passenger wait times by 33%.