Passive acoustic monitoring (PAM) has become a powerful tool for studying wildlife by continuously recording environmental soundscapes. However, analysing large acoustic datasets remains highly time-consuming, as recordings are often annotated manually by domain experts. In this work, we investigate how machine learning can support scalable biodiversity monitoring by enabling efficient population assessment and species identification from PAM data.
We introduce practical workflows that combine transfer learning, active learning, and embedding-based analysis to reduce annotation effort while maintaining high-quality results. For population assessment, we develop a semi-automated annotation pipeline that identifies vocalisations of target species with significantly fewer manually labelled samples [1]. For species identification, we formalise the task of novel class identification (NCI) [2], enabling the discovery of previously unknown species classes in large, unlabelled datasets.
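The interplay of transfer learning and active learning mentioned above can be illustrated with a minimal sketch. It is not the pipeline from [1] or [2]: the synthetic vectors stand in for embeddings from a pre-trained audio model, the nearest-centroid classifier and the margin-based uncertainty score are illustrative choices, and names such as `make_toy_embeddings` and `oracle` are hypothetical. The `oracle` callback plays the role of the human expert who labels one clip per query.

```python
import math
import random


def make_toy_embeddings(n_per_class=50, dim=8, seed=0):
    """Synthetic stand-in for embeddings of audio clips (assumption:
    a real workflow would obtain these from a pre-trained model)."""
    rng = random.Random(seed)
    embeddings, labels = [], []
    for label, centre in ((0, -1.0), (1, 1.0)):
        for _ in range(n_per_class):
            embeddings.append([rng.gauss(centre, 1.0) for _ in range(dim)])
            labels.append(label)
    return embeddings, labels


def fit(embeddings, labelled):
    """Nearest-centroid 'classifier': one mean vector per class.
    Requires at least one labelled example of each class."""
    centroids = {}
    for cls in (0, 1):
        members = [embeddings[i] for i, y in labelled.items() if y == cls]
        dim = len(members[0])
        centroids[cls] = [sum(v[d] for v in members) / len(members)
                          for d in range(dim)]
    return centroids


def margin(embedding, centroids):
    """Uncertainty proxy: a small gap between the distances to the two
    centroids means the current model is unsure about this clip."""
    return abs(math.dist(embedding, centroids[0])
               - math.dist(embedding, centroids[1]))


def predict(embedding, centroids):
    """Assign the class of the closer centroid."""
    closer_to_1 = (math.dist(embedding, centroids[1])
                   < math.dist(embedding, centroids[0]))
    return 1 if closer_to_1 else 0


def active_learning(embeddings, oracle, seed_indices, budget):
    """Pool-based active learning with uncertainty sampling: in each round,
    refit on the labelled set and query the most uncertain unlabelled clip."""
    labelled = {i: oracle(i) for i in seed_indices}
    for _ in range(budget):
        centroids = fit(embeddings, labelled)
        pool = [i for i in range(len(embeddings)) if i not in labelled]
        if not pool:
            break
        query = min(pool, key=lambda i: margin(embeddings[i], centroids))
        labelled[query] = oracle(query)  # the expert annotates one clip
    return labelled, fit(embeddings, labelled)


if __name__ == "__main__":
    X, y = make_toy_embeddings()
    # Seed with one example per class (indices 0 and 50 in the toy data).
    labelled, model = active_learning(X, lambda i: y[i], [0, 50], budget=10)
    print(f"labelled {len(labelled)} of {len(X)} clips")
```

In this sketch the expert labels only the seed clips plus one clip per round, which is the mechanism by which such workflows aim to reduce annotation effort. How much effort is actually saved depends on the data, the embedding model, and the query strategy.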
To demonstrate real-world applicability, we develop prototype bioacoustic analysis tools and evaluate them using field recordings from Brazil, including data from the Fernando de Noronha archipelago and the Amazon rainforest. Our results show that the proposed workflows can reduce annotation effort by a factor of approximately 2–4 and support intuitive exploration of large acoustic datasets. One of the developed systems was used during the XPRIZE Rainforest competition by the Brazilian team [3], which achieved third place among more than 300 participating teams.
This work provides a foundation for next-generation bioacoustic analysis tools that combine machine learning and expert knowledge to enable scalable, quantitative wildlife monitoring and biodiversity assessment.
We received the “Best Student Paper Award” at GoodIT 2024 for this work.

Passive acoustic monitoring recording device deployed in Serra do Cipó National Park, Brazil

Layout of the user interface developed for the XPRIZE Rainforest competition. Left: the selected audio snippet, shown as a spectrogram with playback functionality. Right: the annotation area, with selection panels for the query method, competence class, and species list.
References
[1]: Hannes Kath, Patricia P. Serafini, Ivan B. Campos, Thiago S. Gouvêa, and Daniel Sonntag. 2024. Leveraging transfer learning and active learning for data annotation in passive acoustic monitoring of wildlife. Ecological Informatics 82 (2024), 102710. https://doi.org/10.1016/j.ecoinf.2024.102710
[2]: Hannes Kath, Thiago S. Gouvêa, and Daniel Sonntag. 2024. Active and Transfer Learning for Efficient Identification of Species in Multi-Label Bioacoustic Datasets. In Proceedings of the 2024 International Conference on Information Technology for Social Good (Bremen, Germany) (GoodIT ’24). Association for Computing Machinery, New York, NY, USA, 22–25. https://doi.org/10.1145/3677525.3678635
[3]: Hannes Kath, Ilira Troshani, Bengt Lüers, Thiago S. Gouvêa, and Daniel Sonntag. 2024. Enhancing Biodiversity Monitoring: An Interactive Tool for Efficient Identification of Species in Large Bioacoustics Datasets. In Companion Proceedings of the 26th International Conference on Multimodal Interaction (San Jose, Costa Rica) (ICMI ’24 Companion). Association for Computing Machinery, New York, NY, USA, 91–93. https://doi.org/10.1145/3686215.3688374