☁️ FDR4VGT-CLOUD
Official model release accompanying the manuscript:
**A multisensor deep learning framework for robust cloud segmentation in SPOT-VGT and Proba-V**
Julio Contreras, Cesar Aybar, Luis Gómez-Chova
*IEEE Geoscience and Remote Sensing Letters* (submitted), 2026.

This model is the operational cloud masking algorithm selected for the ESA FDR4VGT reprocessing of the SPOT-VGT and Proba-V archives. It delivers consistent cloud detection across the full SPOT-VGT (VGT1 1998–2003, VGT2 2002–2014) and Proba-V (2013–2020) record: a single sensor-agnostic model for all three missions.
✨ Overview
- Architecture: Hybrid DeepLabV3+ (MobileNetV2 backbone) + Pixel-Wise MLP (PW-DL3+); see the sketch after this list
- Input: 4 Top-of-Atmosphere reflectance bands (Blue, Red, NIR, SWIR), sensor-agnostic
- Supported sensors: SPOT-VGT1, SPOT-VGT2, Proba-V
- Input shape: `[B, 4, 512, 512]`
- Parameters: 12.65M (57.29 MB)
- Training: weak-to-strong supervision, with large-scale pre-training on 3,647 weakly labeled scenes followed by fine-tuning on 109 hand-annotated hard-example scenes
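The exact fusion used in PW-DL3+ is specified in the paper; purely as orientation, the sketch below shows one common way to wire a hybrid of a spatial encoder-decoder and a per-pixel spectral MLP (1×1 convolutions). The module names, hidden width, toy spatial branch, and the logit-addition fusion are illustrative assumptions, not the released architecture.

```python
import torch
import torch.nn as nn

class PixelwiseMLP(nn.Module):
    """Per-pixel spectral MLP implemented with 1x1 convolutions."""
    def __init__(self, in_ch: int = 4, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, hidden, kernel_size=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, kernel_size=1), nn.ReLU(),
            nn.Conv2d(hidden, 1, kernel_size=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

class HybridSegmenter(nn.Module):
    """Spatial branch (stand-in for DeepLabV3+/MobileNetV2) + spectral branch."""
    def __init__(self, spatial: nn.Module, in_ch: int = 4):
        super().__init__()
        self.spatial = spatial          # any module mapping [B,C,H,W] -> [B,1,H,W]
        self.spectral = PixelwiseMLP(in_ch)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Fuse spatial-context and per-pixel spectral logits (assumed: addition)
        return torch.sigmoid(self.spatial(x) + self.spectral(x))

# Toy stand-in for the DeepLabV3+ branch, only to make the sketch runnable
toy_spatial = nn.Sequential(
    nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 1, 3, padding=1)
)
probs = HybridSegmenter(toy_spatial)(torch.rand(1, 4, 512, 512))  # -> [1, 1, 512, 512]
```

The appeal of this pattern is that the pixel-wise branch captures purely spectral cues (robust across sensors), while the convolutional branch adds spatial context such as cloud texture and borders.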
🚀 Quick start
Installation
```bash
pip install mlstac rasterio torch==2.5.1
```
Inference
```python
import torch
import mlstac
import rasterio as rio

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# 1. Load the model
framework = mlstac.download(
    file="https://huggingface.co/isp-uv-es/FDR4VGT-CLOUD/resolve/main/single/multisensor_single_1dpwdeeplabv3.json",
    output_dir="FDR4VGT/single",
)
model = framework.model

# 2. Load a 4-band image (Blue, Red, NIR, SWIR)
with rio.open("https://huggingface.co/isp-uv-es/FDR4VGT-CLOUD/resolve/main/ensemble/rgb.tif") as src:
    image = src.read()

# 3. Run large-scene inference (sliding window + Hann blending)
prob = framework.predict_large(
    image=image,
    model=model,
    device=device,
    batch_size=8,    # increase on GPU to speed up; lower on CPU
    num_workers=8,
    nodata=0,        # pixel value treated as invalid/padding
)

# 4. Binarize with the operational threshold
cloud_mask = (prob.squeeze() > 0.5).astype("uint8")
```
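`predict_large` handles tiling and blending internally. For readers building their own pipeline, the sketch below shows the general idea of sliding-window inference with a 2-D Hann window; `hann_blend_predict` and `predict_tile` are hypothetical helpers, not part of `mlstac`, and the packaged implementation may differ (e.g. in padding and nodata handling).

```python
import numpy as np

def hann_blend_predict(image, predict_tile, tile=512, stride=256):
    """Sliding-window inference with a 2-D Hann window to feather tile seams.

    image        : [C, H, W] array of reflectances
    predict_tile : callable mapping a [C, tile, tile] array to a
                   [tile, tile] probability map (one model forward pass)
    Assumes H and W line up with tile/stride; real pipelines pad the scene first.
    """
    _, H, W = image.shape
    w1d = np.hanning(tile)
    weight = np.outer(w1d, w1d) + 1e-6         # tiny floor so border pixels stay defined
    acc = np.zeros((H, W), dtype=np.float64)   # weighted sum of tile probabilities
    norm = np.zeros((H, W), dtype=np.float64)  # sum of weights per pixel
    for y in range(0, H - tile + 1, stride):
        for x in range(0, W - tile + 1, stride):
            prob = predict_tile(image[:, y:y + tile, x:x + tile])
            acc[y:y + tile, x:x + tile] += prob * weight
            norm[y:y + tile, x:x + tile] += weight
    return acc / norm                          # blended [H, W] probability map
```

Because the Hann window down-weights tile borders, overlapping predictions dominate near seams and the blended map avoids visible tiling artifacts.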
The binarization threshold (default `0.5`) can be tuned per use case; the paper uses the F₂-optimal threshold on the validation set.
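If you have labeled validation scenes, a simple grid search reproduces that recipe. `f2_optimal_threshold` below is an illustrative helper (not part of this release), assuming flattened numpy arrays of probabilities and 0/1 labels:

```python
import numpy as np

def f2_optimal_threshold(prob, label, thresholds=np.linspace(0.05, 0.95, 19)):
    """Grid-search the threshold that maximizes F2 (recall-weighted F-measure)."""
    best_t, best_f2 = 0.5, -1.0
    for t in thresholds:
        pred = prob >= t
        tp = np.sum(pred & (label == 1))
        fp = np.sum(pred & (label == 0))
        fn = np.sum(~pred & (label == 1))
        f2 = 5.0 * tp / (5.0 * tp + 4.0 * fn + fp + 1e-12)  # F_beta with beta = 2
        if f2 > best_f2:
            best_t, best_f2 = t, f2
    return best_t
```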
📊 Performance
Results on the manually annotated test set (PW-DL3+, Multi-FT strategy), mean over scenes:
| Sensor | F₂ | IoU | κ |
|---|---|---|---|
| Proba-V | 0.891 | 0.842 | 0.808 |
| SPOT-VGT | 0.949 | 0.898 | 0.829 |
The model substantially outperforms the legacy BS1 (physical thresholds) and BS2 (pixel-wise MLP) baselines on both sensors, with the largest gain on SPOT-VGT (ΔF₂ = +0.090 over BS1). Temporal analysis across the 1998–2020 archive shows no statistically significant discontinuity at the VGT→Proba-V transition (Mann-Whitney U, p > 0.05), in contrast to the legacy record.
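For reference, F₂, IoU, and Cohen's κ can all be derived from the confusion counts of a predicted/reference mask pair. A plain-numpy sketch (the helper name is illustrative; the paper's exact per-scene evaluation protocol follows the manuscript):

```python
import numpy as np

def cloud_metrics(pred, ref):
    """F2, IoU and Cohen's kappa for a pair of binary masks (1 = cloud)."""
    pred, ref = pred.astype(bool), ref.astype(bool)
    tp = np.sum(pred & ref)
    fp = np.sum(pred & ~ref)
    fn = np.sum(~pred & ref)
    tn = np.sum(~pred & ~ref)
    n = float(tp + fp + fn + tn)
    f2 = 5.0 * tp / (5.0 * tp + 4.0 * fn + fp + 1e-12)
    iou = tp / (tp + fp + fn + 1e-12)
    po = (tp + tn) / n                                            # observed agreement
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2   # chance agreement
    kappa = (po - pe) / (1.0 - pe + 1e-12)
    return f2, iou, kappa
```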
📁 Repository layout
| Path | Description |
|---|---|
| `single/multisensor_single_1dpwdeeplabv3.json` | Operational single-model weights (PW-DL3+) |
| `ensemble/rgb.tif` | Example test scene (4-band TOA reflectance) |
📄 Citation
If you use this model, please cite:
```bibtex
@article{contreras2026fdr4vgt,
  title   = {A multisensor deep learning framework for robust cloud segmentation in SPOT-VGT and Proba-V},
  author  = {Contreras, Julio and Aybar, Cesar and G{\'o}mez-Chova, Luis},
  journal = {IEEE Geoscience and Remote Sensing Letters},
  year    = {2026},
}
```
🙏 Acknowledgements
This work was supported by the European Space Agency (ESA) within the FDR4VGT (Fundamental Data Record for VGT) project, led by VITO.
Developed at the Image Processing Laboratory (IPL), University of Valencia, Spain.