Implementing Fmask and s2cloudless in Python
Combine a physics-based Fmask QA band with the s2cloudless machine-learning probability map using a logical OR merge: a pixel is masked if either detector flags it. This hybrid approach outperforms either method alone, especially for semi-transparent cirrus and topographic shadow. For broader context on when to use this strategy versus simpler SCL-flag approaches, see Cloud and Shadow Masking Strategies.
from s2cloudless import S2PixelCloudDetector
import numpy as np, rasterio
def hybrid_mask(band_paths, fmask_path, threshold=0.45):
stack = np.stack([
rasterio.open(p).read(1).astype(np.float32) / 10000.0
for p in band_paths
], axis=0) # (10, H, W)
inp = stack.transpose(1, 2, 0)[np.newaxis] # (1, H, W, 10)
prob = S2PixelCloudDetector(threshold=threshold, all_bands=True
).get_cloud_probability_maps(inp)[0] # (H, W)
with rasterio.open(fmask_path) as src:
qa = src.read(1)
return (prob >= threshold) | (qa == 4) | (qa == 2)
Why this combination arises in remote sensing pipelines
Single-model cloud detection fails in predictable ways. Fmask relies on physical spectral thresholds — brightness temperature, NDVI ratios — which are reliable for thick convective cloud and geometric shadow projection but miss thin cirrus that lacks a thermal signal. s2cloudless uses a LightGBM classifier trained on Sentinel-2 top-of-atmosphere (TOA) reflectance and handles cirrus well, but can underdetect shadows and small opaque clouds where the spectral signature is ambiguous.
This pairing is therefore standard inside Satellite Processing Workflows & Index Pipelines: run both detectors in parallel, union their boolean outputs, and pass the result downstream before any spectral index calculation or temporal aggregation so cloud-contaminated pixels never reach the analytics layer.
The data-flow below shows how the two detectors feed into the fusion step:
Environment and setup
| Package | Minimum version | Purpose |
|---|---|---|
s2cloudless |
1.5.0 | ML cloud probability from Sentinel-2 L1C |
rasterio |
1.3.0 | Band I/O and COG output |
numpy |
1.24 | Array operations and boolean fusion |
lightgbm |
3.3 | Installed automatically by s2cloudless |
pip install s2cloudless>=1.5.0 rasterio>=1.3.0 numpy>=1.24
Band order is strict. s2cloudless with all_bands=True requires exactly these 10 bands in order: B01, B02, B04, B05, B08, B8A, B09, B10, B11, B12. Passing bands in the wrong order silently degrades accuracy without raising an error.
Fmask is a CLI tool. The official Python bindings are deprecated and unstable outside conda-managed environments. Invoke Fmask via subprocess before running this script, then point fmask_path at the generated *_fmask.tif QA file.
Complete working example
The function below processes a full Sentinel-2 scene using chunked rasterio windows to keep memory bounded. All band files are read once at the start; Fmask QA is decoded in a single pass; s2cloudless inference runs tile by tile.
import numpy as np
import rasterio
from rasterio.windows import Window
from s2cloudless import S2PixelCloudDetector
from pathlib import Path
def load_and_scale_bands(band_paths: list[str]) -> tuple[np.ndarray, dict]:
"""
Load Sentinel-2 L1C bands into a float32 stack normalized to [0, 1].
band_paths must follow s2cloudless band order:
B01, B02, B04, B05, B08, B8A, B09, B10, B11, B12
"""
with rasterio.open(band_paths[0]) as src:
profile = src.profile.copy()
height, width = src.height, src.width
# Sentinel-2 L1C quantification value: DN / 10000 = TOA reflectance
stack = np.zeros((len(band_paths), height, width), dtype=np.float32)
for i, path in enumerate(band_paths):
with rasterio.open(path) as src:
stack[i] = np.clip(src.read(1, out_dtype=np.float32) / 10000.0, 0.0, 1.0)
return stack, profile
def decode_fmask_qa(qa_path: str) -> np.ndarray:
"""
Return a boolean array where True means cloud or cloud shadow.
Fmask 4.x categorical values:
0 = clear land, 1 = water, 2 = cloud shadow,
3 = snow/ice, 4 = cloud, 255 = fill/nodata
"""
with rasterio.open(qa_path) as src:
qa = src.read(1)
return (qa == 4) | (qa == 2)
def generate_hybrid_mask(
band_paths: list[str],
fmask_qa_path: str,
output_path: str,
cloud_threshold: float = 0.45,
chunk_size: int = 1024,
) -> None:
"""
Write a uint8 boolean cloud-shadow mask by merging s2cloudless
probability and Fmask QA via logical OR.
Parameters
----------
band_paths : 10 Sentinel-2 L1C band files in s2cloudless order
fmask_qa_path : path to pre-generated Fmask *_fmask.tif
output_path : destination COG path (uint8, 0=clear, 1=masked)
cloud_threshold : s2cloudless probability cut-off (default 0.45)
chunk_size : tile side length in pixels for chunked inference
"""
stack, profile = load_and_scale_bands(band_paths)
height, width = stack.shape[1], stack.shape[2]
# Instantiate detector once — the LightGBM model loads at construction
detector = S2PixelCloudDetector(threshold=cloud_threshold, all_bands=True)
# Load Fmask for the full scene in one read (fast sequential access)
fmask_full = decode_fmask_qa(fmask_qa_path)
# Output is uint8 boolean; deflate compression saves ~70 % vs uncompressed
out_profile = profile.copy()
out_profile.update(dtype="uint8", count=1, compress="deflate", nodata=None)
with rasterio.open(output_path, "w", **out_profile) as dst:
for row in range(0, height, chunk_size):
for col in range(0, width, chunk_size):
win = Window(
col, row,
min(chunk_size, width - col),
min(chunk_size, height - row),
)
# ── s2cloudless inference ────────────────────────────────────
# Slice shape: (bands, H, W) → transpose → (H, W, bands)
# Add batch dimension: (1, H, W, bands) as expected by the API
chunk = stack[
:,
win.row_off : win.row_off + win.height,
win.col_off : win.col_off + win.width,
].transpose(1, 2, 0)[np.newaxis]
# get_cloud_probability_maps returns (n_samples, H, W)
prob = detector.get_cloud_probability_maps(chunk)[0]
ml_mask = prob >= cloud_threshold
# ── Fmask QA slice ───────────────────────────────────────────
fmask_chunk = fmask_full[
win.row_off : win.row_off + win.height,
win.col_off : win.col_off + win.width,
]
# ── Fusion: flag pixel if EITHER detector marks it ───────────
hybrid = (ml_mask | fmask_chunk).astype(np.uint8)
dst.write(hybrid, 1, window=win)
Variant patterns
Fmask 5.x bit-packed QA
Fmask 5.x switched from categorical integers to a bit-packed uint8 for compact storage. Replace the equality checks with bitwise masks:
def decode_fmask5_qa(qa_path: str) -> np.ndarray:
"""Decode Fmask 5.x bit-packed QA band (check sensor-specific docs for bit definitions)."""
with rasterio.open(qa_path) as src:
qa = src.read(1)
# Bit 0 = cloud, bit 1 = cloud shadow (verify against your Fmask 5.x release notes)
fmask_cloud = (qa & 0b00000001) > 0
fmask_shadow = (qa & 0b00000010) > 0
return fmask_cloud | fmask_shadow
Always cross-check the bit positions against the Fmask algorithm reference for your sensor version — bit definitions differ between Landsat and Sentinel-2 outputs.
Dask-enabled variant for large AOIs
When the scene footprint exceeds available RAM, replacing the explicit chunk loop with rioxarray and Dask parallelizes tile processing across CPU cores without manual window management. This pattern integrates naturally with Advanced Resampling and Upscaling Techniques when spatial resolution harmonization is required before masking.
import rioxarray
import xarray as xr
import dask.array as da
def dask_hybrid_mask(
band_paths: list[str],
fmask_qa_path: str,
cloud_threshold: float = 0.45,
chunks: dict = {"x": 1024, "y": 1024},
) -> xr.DataArray:
"""
Build the hybrid mask lazily with Dask.
Call .compute() to materialise or .rio.to_raster(...) to write.
"""
# Load bands as a lazy Dask-backed DataArray
arrays = [
rioxarray.open_rasterio(p, chunks=chunks).squeeze("band", drop=True) / 10000.0
for p in band_paths
]
stack = xr.concat(arrays, dim="band") # (band, y, x)
# Fmask (small enough to load eagerly)
with rasterio.open(fmask_qa_path) as src:
fmask_np = src.read(1)
fmask_da = da.from_array(fmask_np, chunks=(chunks["y"], chunks["x"]))
fmask_mask = (fmask_da == 4) | (fmask_da == 2)
# s2cloudless does not accept Dask arrays directly — map_blocks bridges the gap
def _apply_detector(block: np.ndarray) -> np.ndarray:
# block shape: (bands, H, W)
inp = block.transpose(1, 2, 0)[np.newaxis]
prob = S2PixelCloudDetector(
threshold=cloud_threshold, all_bands=True
).get_cloud_probability_maps(inp)[0]
return (prob >= cloud_threshold).astype(np.uint8)
ml_mask = da.map_blocks(
_apply_detector,
stack.data, # (bands, y, x) Dask array
dtype=np.uint8,
drop_axis=0, # output drops the band dimension
)
hybrid = da.logical_or(ml_mask, fmask_mask).astype(np.uint8)
return xr.DataArray(hybrid, coords={"y": stack.y, "x": stack.x})
Applying the mask to downstream index calculations
Once the boolean mask is written, apply it before computing any vegetation or water index so contaminated pixels are excluded from statistics. This step connects directly to Spectral Index Calculation Pipelines:
import numpy as np
import rasterio
def apply_cloud_mask_to_band(
band_path: str,
mask_path: str,
nodata_value: float = np.nan,
) -> tuple[np.ndarray, dict]:
"""Replace cloud/shadow pixels with nodata before index arithmetic."""
with rasterio.open(band_path) as src:
band = src.read(1, out_dtype=np.float32)
profile = src.profile.copy()
with rasterio.open(mask_path) as src:
mask = src.read(1).astype(bool)
band[mask] = nodata_value
profile.update(dtype="float32", nodata=nodata_value)
return band, profile
Common errors
ValueError: Input data has incorrect shape — s2cloudless received an array with the wrong number of bands or a shape that is not (n, H, W, 10). Verify that band_paths contains exactly 10 files in the documented order and that the transpose step produces (H, W, 10) before the np.newaxis batch dimension is added.
rasterio.errors.CRSError: Unable to open EPSG support file — PROJ environment variables are not set. Resolve with export PROJ_LIB=$(python -c "import pyproj; print(pyproj.datadir.get_data_dir())"), or install the pyproj wheel which bundles PROJ data automatically.
MemoryError during full-scene band load — The default approach loads all 10 bands into RAM before chunking. For 60 m/20 m mixed-resolution scenes at full swath width (110 km), reduce chunk_size to 512 or switch to the Dask variant above.
Frequently asked questions
Which Sentinel-2 bands does s2cloudless require?
s2cloudless expects exactly 10 bands in this order: B01, B02, B04, B05, B08, B8A, B09, B10, B11, B12, all normalized to [0, 1] TOA reflectance. Omitting B10 (cirrus) in particular removes the detector’s primary cirrus signal.
What do Fmask categorical values mean? In Fmask 4.x: 0 = clear land, 1 = water, 2 = cloud shadow, 3 = snow/ice, 4 = cloud, 255 = fill/nodata. Fmask 5.x switches to bit-packed encoding where individual bits represent each class.
Can I call Fmask directly from Python?
The official Python bindings are deprecated and unstable outside conda-managed environments. Production workflows invoke Fmask via subprocess, then read the resulting *_fmask.tif QA band with rasterio. Direct in-memory execution via ctypes or pybind11 is not supported in modern environments.
How should I tune the cloud threshold?
The default 0.45 balances precision and recall for mid-latitude Sentinel-2 scenes. In tropical regions or areas with persistent haze, lower the threshold to 0.35 and validate against a stratified random sample of manually labelled pixels. A well-tuned hybrid mask typically achieves F1 ≥ 0.92 for clouds and F1 ≥ 0.88 for shadows across diverse biomes.
Related
- Cloud and Shadow Masking Strategies — parent page covering the full masking strategy landscape including SCL flags, morphological dilation, and scene-level QA scoring.
- Satellite Processing Workflows & Index Pipelines — the broader processing pipeline this masking step feeds into, from atmospheric correction through spectral index output.
- Spectral Index Calculation Pipelines — the downstream stage that consumes cloud-free pixels to compute NDVI, EVI, NDWI, and related indices.
- Advanced Resampling and Upscaling Techniques — covers resolution harmonization for multi-resolution Sentinel-2 bands before stacking for cloud detection.
- Automated Image Clipping and Cropping — apply spatial subsetting before masking to reduce memory overhead when processing large scene footprints.