Decentralised Learning (DEAL)

Aim
The use case aims to enhance the robustness and reliability of automated biodiversity image processing and classification, with a particular focus on plankton and deep-water animal imagery. By leveraging decentralised Swarm Learning techniques, the approach seeks to overcome the current fragmentation caused by bespoke classifiers trained on limited datasets. In doing so, it will allow for the development of models that are less biased and more capable of handling diverse or rare organisms. In addition, DEAL pursues data-related objectives by accessing third-party biodiversity imagery to enrich the training process and by verifying the architecture requirements of the DEAL framework, ensuring seamless integration with the AI4EOSC infrastructure.
Development actions during iMagine
Deployment of a DEAL node within the iMagine AI Platform to support distributed learning on biodiversity imagery.
Integration of third-party plankton and deep-water animal image datasets into the Swarm Learning workflow.
Development of decentralised classification models with improved robustness and reduced bias.
Validation of cloud computing requirements, including data storage, access, and scalability within the platform and AI4EOSC.
Strengthening collaboration across biodiversity research communities through privacy-preserving data sharing approaches.
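To illustrate the general idea behind the decentralised training named in these actions, the following is a minimal sketch of one parameter-merging round between two nodes. It is not the DEAL implementation: the `Node` class, the node names, the weight values and the sample-weighted averaging rule are all assumptions chosen for illustration, and a real Swarm Learning deployment would also handle peer coordination and secure transport. The point it shows is that only model parameters are exchanged, while the images stay on the node that owns them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical participant holding private images and local model weights."""
    name: str
    weights: List[float]  # local model parameters after a training round
    n_samples: int        # size of the private dataset; the images are never shared


def merge_weights(nodes: List[Node]) -> List[float]:
    """Return the sample-weighted average of the parameters held by each node."""
    total = sum(node.n_samples for node in nodes)
    dim = len(nodes[0].weights)
    return [
        sum(node.weights[i] * node.n_samples for node in nodes) / total
        for i in range(dim)
    ]


def swarm_round(nodes: List[Node]) -> List[float]:
    """Run one merge round: average the parameters, then let every node adopt them."""
    merged = merge_weights(nodes)
    for node in nodes:
        node.weights = list(merged)
    return merged


# Illustrative values only: two nodes with differently sized private datasets.
nodes = [
    Node("plankton_lab", [0.2, 0.8], n_samples=300),
    Node("deep_water_lab", [0.6, 0.4], n_samples=100),
]
merged = swarm_round(nodes)
print(merged)  # approximately [0.3, 0.7]
```

Weighting by dataset size is one common choice; it lets a node with more imagery influence the shared model more, which may or may not be desirable when rare species are concentrated at small sites.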
Expected Results
The expected outcomes include the development of more accurate and generalisable models for plankton and deep-water animal classification, achieved through collaborative, decentralised learning that maintains data privacy. The use of external imagery sources will strengthen the DEAL models, broadening their applicability across different environments and reducing performance gaps when confronted with new or rare species.
Furthermore, the deployment of a DEAL node within the AI4EOSC framework will validate the cloud computing design, particularly in terms of data storage, access, and interoperability. Ultimately, the project is expected to deliver both technical advancements in biodiversity image analysis and a scalable infrastructure model that addresses concerns around data ownership, privacy, and duplication.