Diffusion-Based Learning and Foundation Model Adaptation for Robust Dense Prediction in Earth Observation
Time: Mon 2026-05-11 14.00
Location: D37, Lindstedtsvägen 5, Stockholm
Video link: https://kth-se.zoom.us/j/66145987135
Language: English
Subject area: Geodesy and Geoinformatics, Geoinformatics
Doctoral student: Ali Shibli, Geoinformatics
Opponent: Professor Srinivasan Keshav, University of Cambridge, UK
Supervisor: Professor Yifang Ban, Geoinformatics; Associate Professor Andrea Nascetti, Geoinformatics
QC 20260427
Abstract
Dense prediction tasks such as semantic segmentation, change detection, and wildfire burned-area mapping are central to Earth observation, yet deep learning models trained for these tasks frequently degrade under the geographic, temporal, and spatiotemporal distribution shifts encountered in real-world deployment. This thesis investigates how diffusion-based learning and parameter-efficient adaptation can improve the robustness and generalization of dense prediction models for Earth observation, with a particular focus on wildfire monitoring using Sentinel-2 imagery.
Three complementary studies are presented in this thesis. The first introduces Noise2Map, a discriminative diffusion model that repurposes structured noise as a supervisory signal for semantic segmentation and change detection. Unlike prior diffusion approaches that require iterative sampling, Noise2Map performs single-pass inference, ranking first across three benchmarks while running 13.5× faster than the closest diffusion baseline. The second study proposes a diffusion-based decoder that operates in the representation space of frozen geospatial foundation models (GFMs) to improve zero-shot generalization for wildfire burned-area mapping. The diffusion decoder improves performance in 14 of 16 backbone–protocol–region combinations, with gains of up to +4.8 F1, and extends to out-of-distribution European wildfires not seen during training. The third study systematically evaluates adaptation strategies for GFMs, comparing full fine-tuning, decoder-only fine-tuning, and Low-Rank Adaptation (LoRA), on large-scale wildfire mapping across North America. LoRA consistently outperforms the alternatives, improving IoU by up to +9.35 points over full fine-tuning for Prithvi-v2 while keeping more than 99% of backbone parameters frozen.
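The LoRA strategy evaluated in the third study can be illustrated with a minimal sketch: a frozen weight matrix is augmented by a trainable low-rank correction, so only a small fraction of parameters is updated. The dimensions, layer, and scaling value below are illustrative assumptions, not details taken from the thesis or from Prithvi-v2.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer dimensions; real GFM layers are far larger,
# which is why the trainable fraction drops below 1% in practice.
d_out, d_in, rank = 256, 256, 8
W = rng.standard_normal((d_out, d_in))   # frozen pretrained weight
A = rng.standard_normal((rank, d_in))    # trainable down-projection
B = np.zeros((d_out, rank))              # trainable up-projection (init to zero)
alpha = 16.0                             # LoRA scaling hyperparameter

def lora_forward(x):
    """Forward pass: frozen path plus scaled low-rank correction."""
    return W @ x + (alpha / rank) * (B @ (A @ x))

# At initialisation B = 0, so the adapted layer matches the frozen one.
x = rng.standard_normal(d_in)
assert np.allclose(lora_forward(x), W @ x)

# Only A and B receive gradient updates; W stays frozen.
trainable = A.size + B.size
total = W.size + trainable
print(f"trainable fraction: {trainable / total:.2%}")
```

Because B starts at zero, training begins exactly at the pretrained model and the low-rank path gradually learns a task-specific correction, which is one reason LoRA can be less disruptive to pretrained representations than full fine-tuning.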
Together, these studies show that constraining how models learn, whether through structured noise, frozen encoders, or low-rank updates, yields better generalization than simply training more parameters. Diffusion-based learning and parameter-efficient adaptation thus offer practical, complementary paths toward robust Earth observation.