Learning to Communicate and Collaborate in a Competitive Multi-Agent Setup to Clean the Ocean from Macroplastics

Between 1950 and 2019, annual plastic production increased to 460 million tonnes (Ritchie & Roser, 2018). About 12 million tonnes end up in the ocean every year (Stafford & Jones, 2019; Cressey, 2016; Eunomia, 2016). If the plastic production rate continues to rise, plastic in the ocean is projected to outweigh fish by 2050 (WEF, 2016). More than half of the plastic floating in the ocean, i.e. macroplastics, is less dense than seawater and will not sink to the seabed (Lebreton et al., 2017). The buoyant plastic mass therefore stays within the top few metres of the water column (Reisser et al., 2015; Kooi et al., 2016). This is harmful to marine life, to the marine ecosystem as a whole, and to humans. Around 800 coastal and marine species are directly affected by plastics in the ocean (Harding, 2016). Of the affected species, 17% are on the International Union for Conservation of Nature (IUCN) list of endangered species (Gall & Thompson, 2015). Animals are affected in several ways. Entanglement can restrict movement and the ability to feed, cause infections, and ultimately lead to death (Fisheries, 2021). Animals also ingest plastics because they mistake debris for food based on its colour and shape. When microplastics get entangled in algae and seaweed, the combination produces an odour that attracts animals (Pfaller et al., 2020). Microplastics may also look like plankton, which is food for many species. Even small organisms such as coral polyps consume microplastics (Rotjan et al., 2019). Every year, 100,000 marine mammals and a million seabirds die because of plastic waste (Martin, 2023). Importantly, when humans consume seafood, they also consume plastic toxins (Smith et al., 2018), which can be linked to hormonal abnormalities and developmental problems (Rochester, 2013). Furthermore, there is a concern that plastics in the ocean will degrade into nanoplastics, which could enter human cells (Mattsson et al., 2015).
One of the largest accumulations of garbage in the ocean is the Great Pacific Garbage Patch (GPGP), first found and named by Charles J. Moore, a competitive sailor, while he was returning from a race in 1997. The GPGP covers an area three times the size of France (sally.gruger, 2015); more precisely, its surface area is estimated at 1.6 million square kilometres (Lebreton et al., 2018).

To measure the extent of the GPGP, a company named The Ocean Cleanup had to use a fleet of 30 boats, 652 surface nets and two flights. The size and dynamic nature of the patch make its cleanup a strong use case for a highly distributed system of multiple agents that have learned to act independently. We believe efforts to clean up oceans and rivers can benefit from multi-agent reinforcement learning (MARL) systems, and that a communication mechanism can contribute significantly to higher planning precision and efficiency. We have been inspired by the animal world, e.g. the way a dog communicates by wagging its tail or the red colour an octopus displays in an alert state. We propose a highly distributed MARL system with a dynamic graph neural network (GNN) communication layer, allowing pairs of agents to observe the garbage density of simulated satellite data and actively communicate signals as part of their action space.
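To make the idea of signals as part of the action space more concrete, here is a minimal sketch, not the implementation from the paper. It assumes a 2D grid of garbage densities, a composite action of (move, signal) per agent, and a communication graph that is rebuilt every step from agent distances. All names (`local_density`, `comm_graph`, `aggregate_signals`, `step`), the communication range, and the reward rule are hypothetical, and the learned GNN layer is replaced here by a fixed mean over neighbour signals.

```python
import numpy as np

# Hypothetical move set: stay, up, down, left, right (row, column offsets).
MOVES = np.array([[0, 0], [-1, 0], [1, 0], [0, -1], [0, 1]])


def local_density(grid, pos, radius=1):
    """Observation: mean garbage density in a square window around an agent."""
    r, c = pos
    r0, r1 = max(r - radius, 0), min(r + radius + 1, grid.shape[0])
    c0, c1 = max(c - radius, 0), min(c + radius + 1, grid.shape[1])
    return float(grid[r0:r1, c0:c1].mean())


def comm_graph(positions, comm_range):
    """Dynamic adjacency: two agents are linked while within comm_range
    (Chebyshev distance). This range rule is an assumption of the sketch."""
    dist = np.abs(positions[:, None, :] - positions[None, :, :]).max(axis=-1)
    adj = (dist <= comm_range).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-messages
    return adj


def aggregate_signals(adj, signals):
    """One message-passing round: each agent receives the mean signal of its
    neighbours. A learned GNN layer would replace this fixed mean."""
    degree = adj.sum(axis=1, keepdims=True)
    return (adj @ signals) / np.maximum(degree, 1.0)


def step(grid, positions, moves, signals, comm_range=3):
    """Apply composite actions (move index, signal vector) for all agents.
    Collects garbage in place on `grid` and returns what each agent hears."""
    limits = np.array(grid.shape) - 1
    new_pos = np.clip(positions + MOVES[moves], 0, limits)
    rewards = np.zeros(len(new_pos))
    for i, (r, c) in enumerate(new_pos):
        rewards[i] = grid[r, c]  # first agent on a cell takes all of it
        grid[r, c] = 0.0
    inbox = aggregate_signals(comm_graph(new_pos, comm_range), signals)
    return new_pos, rewards, inbox


if __name__ == "__main__":
    grid = np.zeros((5, 5))
    grid[0, 1] = 2.0
    positions = np.array([[0, 0], [4, 4], [1, 1]])
    moves = np.array([4, 0, 0])                    # agent 0 moves right
    signals = np.array([[1.0], [0.0], [0.5]])      # one scalar signal each
    print(step(grid, positions, moves, signals, comm_range=2))
```

In this sketch the agent in the far corner is out of range and receives an all-zero message, while the two nearby agents hear each other's signal. Because the graph is recomputed from positions at every step, the set of communicating pairs changes as the agents move.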

This work was published at the Tackling Climate Change with Machine Learning Workshop at ICLR 2023: url. The Ocean Plastic Collector environment can also be explored as a web app here: url

Environment Walkthrough

Poster Presentation at ICLR 2023

Drone-based Reforestation Environment