Key to reinforcement learning in multi-agent systems is the ability to exploit the fact that each agent directly influences only a small subset of the other agents. Such loose couplings are often modelled using a graphical model: a coordination graph. Finding an (approximately) optimal joint action for a given coordination graph is therefore a central subroutine in cooperative multi-agent reinforcement learning (MARL). Much research in MARL focuses on how to gradually update the parameters of the coordination graph, whilst leaving the solving of the coordination graph to a known, typically exact and generic subroutine. However, exact methods (e.g., Variable Elimination) do not scale well, and generic methods do not exploit the MARL setting, in which the coordination graph is updated gradually and the joint action to select is recomputed after each update. In this paper, we examine what happens if we use a heuristic method, i.e., local search, to select joint actions in MARL, and whether we can use the outcome of this local search from a previous time-step to speed up and improve local search. We show empirically that by using local search, we can scale up to many agents and complex coordination graphs, and that by reusing joint actions from the previous time-step to initialise local search, we can improve both the quality of the joint actions found and the speed with which these joint actions are found.
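To make the idea of warm-started local search concrete, the following is a minimal sketch. It is not the paper's implementation. It assumes a coordination graph with pairwise payoff tables only, and uses plain best-response hill climbing as the local search. The names `joint_value` and `local_search`, the `init` parameter, and the three-agent chain example are all hypothetical and chosen for illustration.

```python
import random


def joint_value(edges, action):
    """Total payoff of a joint action: sum of pairwise payoff tables over the graph edges."""
    return sum(table[action[i]][action[j]] for (i, j), table in edges.items())


def local_search(edges, n_agents, n_actions, init=None, seed=0):
    """Best-response hill climbing. Returns (joint_action, value, sweeps)."""
    rng = random.Random(seed)
    if init is not None:
        action = list(init)  # warm start, e.g. the previous time-step's joint action
    else:
        action = [rng.randrange(n_actions) for _ in range(n_agents)]
    sweeps, improved = 0, True
    while improved:
        improved = False
        sweeps += 1
        for agent in range(n_agents):
            # For brevity the full value is recomputed; only the edges incident
            # to `agent` actually change when its action changes.
            best_a, best_v = action[agent], joint_value(edges, action)
            for a in range(n_actions):
                action[agent] = a
                v = joint_value(edges, action)
                if v > best_v:  # strict improvement only, so the loop terminates
                    best_a, best_v, improved = a, v, True
            action[agent] = best_a
    return action, joint_value(edges, action), sweeps


# Hypothetical chain 0-1-2 with two actions per agent (assumed payoffs).
table = [[1, 0], [0, 2]]
edges = {(0, 1): table, (1, 2): table}
prev_action, _, _ = local_search(edges, 3, 2, init=[1, 1, 1])

# After a small payoff update, reuse the previous joint action as the start.
updated = {(0, 1): [[1, 0], [0, 3]], (1, 2): table}
warm_action, warm_value, warm_sweeps = local_search(updated, 3, 2, init=prev_action)
```

In this toy graph, starting from `[0, 0, 0]` leaves the search stuck in a local optimum, while the warm start from the previous solution confirms a better joint action in a single sweep. This illustrates why the choice of initialisation can affect both quality and speed.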
Artificially intelligent agents increasingly collaborate with humans in human-agent teams. Timely, proactive sharing of relevant information within the team contributes to overall team performance. This paper presents a machine learning approach to proactive communication in AI agents using contextual factors. Proactive communication was learned in two consecutive experimental steps: (a) multi-agent team simulations to learn effective communicative behaviors, and (b) human-agent team experiments to refine communication so that it is suitable for a human team member. Results consist of proactive communication policies for communicating both beliefs and goals within human-agent teams. Agents learned to use minimal communication to improve team performance in simulation, while they learned more specific, socially desirable behaviors in the human-agent team experiment.
Autonomous exploration and mapping is one of the open challenges of robotics and artificial intelligence. Especially when the environment is unknown, choosing the optimal navigation directive is not straightforward. In this paper, we propose a reinforcement learning framework for navigating, exploring, and mapping unknown environments. The reinforcement learning agent is in charge of selecting the commands for steering the mobile robot, while a SLAM algorithm estimates the robot pose and maps the environment. To select optimal actions, the agent is trained to be curious about the world. This concept translates into a curiosity-driven reward function that encourages the agent to steer the mobile robot towards unknown and unseen areas of the world and the map. We test our approach in exploration challenges in different indoor environments. The agent trained with the proposed reward function outperforms agents trained with reward functions commonly used in the literature for solving such tasks.
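The abstract does not specify the form of the curiosity-driven reward, so the sketch below substitutes a simple stand-in: rewarding the number of map cells that become known between two consecutive SLAM map estimates. The grid encoding (`-1` for unknown cells), the function names, and the `gain_weight` and `collision_penalty` parameters are all assumptions made for illustration, not the authors' reward function.

```python
UNKNOWN = -1  # assumed encoding: -1 unknown, 0 free, 1 occupied


def count_known(grid):
    """Number of cells in a 2D occupancy grid that are no longer unknown."""
    return sum(cell != UNKNOWN for row in grid for cell in row)


def exploration_reward(prev_grid, new_grid, collided=False,
                       gain_weight=1.0, collision_penalty=10.0):
    """Reward newly mapped cells; optionally penalise collisions (hypothetical term)."""
    newly_known = count_known(new_grid) - count_known(prev_grid)
    reward = gain_weight * max(newly_known, 0)  # never reward map shrinkage
    if collided:
        reward -= collision_penalty
    return reward


# Hypothetical 2x2 map before and after one steering command.
before = [[UNKNOWN, UNKNOWN],
          [0, UNKNOWN]]
after = [[0, UNKNOWN],
         [0, 1]]
step_reward = exploration_reward(before, after)
```

A reward of this shape pays out only when an action reveals unseen parts of the map, which is one simple way to push an agent towards unexplored areas.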
Teachers have a crucial role in bringing about the extensive social changes that are needed to build a sustainable future. In the EduSTA project, we focus on the sustainability competences of teachers. We strengthen the European dimension of teacher education via Digital Open Badges as a means of performing, acknowledging, documenting, and transferring the competences as micro-credentials. EduSTA starts by mapping the contextual possibilities and restrictions for transformative learning on sustainability and by operationalising skills. The development of competence-based learning modules and open digital badge-driven pathways will proceed hand in hand, and will be realised as learning modules in the partnering Higher Education Institutes and as badge applications open to all teachers in Europe. Societal Issue: Teachers’ capability to act as active facilitators of change in the ecological transition, and to educate citizens and the workforce to meet future challenges, is key to a profound transformation in the green transition. Teachers’ sustainability competences have been researched widely, but a gap remains between research and teachers’ practice. There is a need to operationalise sustainability competences: to describe direct links with everyday tasks, such as curriculum development, pedagogical design, and assessment. This need calls for an urgent operationalisation of educators’ sustainability competences, to support the goals with sustainability actions and to transfer this understanding to their students. Benefit to society: EduSTA builds a community, “Academy of Educators for Sustainable Future”, and creates open digital badge-driven learning pathways for teachers’ sustainability competences, supported by multimodal learning modules.
The aim is to achieve close cooperation with training schools to actively engage in-service teachers. Our consortium is a catalyst for leading and empowering profound change, now and in the future, by educating teachers who are ready to meet the challenges and to act as active change agents for a sustainable future. Emphasizing teachers’ essential role in the green transition also adds to the attractiveness of teachers’ work.
The Dutch main water systems face pressing environmental, economic and societal challenges due to climatic changes and increased human pressure. There is a growing awareness that nature-based solutions (NBS) are cost-effective, simultaneously provide environmental, social and economic benefits, and help build resilience. Despite being carefully designed and tested, many projects fail along the way or are never implemented in the first place, wasting resources and undermining practitioners' trust and confidence in NBS. Why do so many projects lose momentum even after a proof of concept is delivered? Usually, failure can be attributed to a combination of eroding political will, societal opposition and economic uncertainties. While ecological and geological processes are often well understood, there is almost no understanding of the societal and economic processes related to NBS. Therefore, there is an urgent need to carefully evaluate the societal, economic, and ecological impacts and to identify design principles that foster societal support for, and the economic viability of, NBS. We address these critical knowledge gaps in this research proposal, using the largest river restoration project of the Netherlands, the Border Meuse (Grensmaas), as a Living Lab. In our transdisciplinary consortium, stakeholders have a key role as recipients and providers of information, and the broader public is involved through citizen science. Our research is scientifically innovative in using mixed methods, combining novel qualitative methods (e.g. continuous participatory narrative inquiry) and quantitative methods (e.g. economic choice experiments to elicit tradeoffs and risk preferences, agent-based modeling). The ultimate aim is to create an integral learning environment (workbench) as a decision support tool for NBS.
The workbench gathers data and prepares and verifies data sets, to help stakeholders (companies, government agencies, NGOs) quantify impacts and visualize the tradeoffs of decisions regarding NBS.
To keep care good, affordable and accessible, the norm must become: by yourself if possible, at home if possible, and digitally if possible. In elderly care practice, the potential of 'digital' is recognised, and in many organisations change agents (also called innovation managers, project managers or designated leads) have been tasked with implementing ehealth. Sustainable implementation of such technologies, meaning that they actually become embedded in 'normal' work processes and in the organisation, is however complex and often stalls after the pilot phase. Change agents indicate that existing implementation models and step-by-step plans do not sufficiently capture the unruliness of actual practice, and that introducing something new while the quality of care must remain safeguarded does not combine well. Innovation requires room to experiment and room for things to go wrong, and that is not possible with the vulnerable target group that residential care organisations for older people serve. The project “Train de change agent: S(t)imulatie van implementatie van digitale technologie in de ouderenzorg” (Train the change agent: S(t)imulation of the implementation of digital technology in elderly care) aims to prevent this reluctance to act among change agents in the complex change processes involved in implementing digital technology in this context. Together with knowledge institutions, elderly care organisations and SMEs, we want to develop a simulation learning environment for change agents in which, based on authentic cases, they can safely learn the skills needed to implement ehealth sustainably. Based on needs assessments in the care sector, “Train de change agent” will focus on ehealth tools and digital technology that increase clients' self-reliance and quality of life and thereby contribute to reablement, or that save labour for professionals.
The central question of this project is: “How can we use simulation to better equip change agents to determine and adjust strategies for implementing digital care technology in the dynamic and highly complex contexts in which care is delivered?” The simulation learning environment will be designed so that it does justice to the complexity of sustainably implementing this technology, and so that deep learning through reflection-on-action and reflection-in-action is possible. The form and content of the learning environment will be developed together with change agents in elderly care from the outset. Work will take place in and on a learning community, to give learning from and with one another a proper place in the project and to create the preconditions for sustainable implementation of the learning environment to be developed. In an iterative process, the frameworks will be collected from experts and the cases from change agents, the technical requirements for the learning environment will be determined, and the content of the simulation and other required materials will be developed, with regular interim tests with the end users (the change agents). The project will run for 2 years.