Choosing the right resampling method for Sentinel-2
Use Resampling.nearest for Sentinel-2 QA masks and the Scene Classification Layer, and Resampling.bilinear (or Resampling.cubic) for continuous reflectance bands. The minimal snippet:
import rasterio
from rasterio.enums import Resampling
# Reflectance band: 20 m → 10 m
with rasterio.open("S2_B05_20m.tif") as src:
data = src.read(
1,
out_shape=(src.height * 2, src.width * 2),
resampling=Resampling.bilinear,
)
# QA / SCL mask: nearest neighbor preserves integer class codes
with rasterio.open("S2_SCL_20m.tif") as src:
scl = src.read(
1,
out_shape=(src.height * 2, src.width * 2),
resampling=Resampling.nearest,
)
Why this choice arises in Sentinel-2 workflows
Sentinel-2’s MultiSpectral Instrument delivers 13 bands at three native resolutions: 10 m (B2, B3, B4, B8), 20 m (B5–B8A, B11, B12), and 60 m (B1, B9, B10). Almost every pixel-wise operation — index calculation, classification, or time-series stacking — requires aligning these to a single common grid. The resampling algorithm you choose controls radiometric fidelity, mask integrity, and whether downstream models see physically meaningful values.
This page is part of the Advanced Resampling & Upscaling Techniques workflow, which sits inside the broader Satellite Processing Workflows & Index Pipelines.
Environment & setup
| Package | Minimum version | Role |
|---|---|---|
rasterio |
1.3 | Band I/O, Resampling enum, in-memory warp |
numpy |
1.24 | Array operations, histogram comparison |
scipy |
1.11 | KS-test for histogram validation (optional) |
Install with:
pip install "rasterio>=1.3" "numpy>=1.24" "scipy>=1.11"
Algorithm reference
Before writing code it helps to understand what each kernel actually computes, because the error modes differ:
-
Resampling.nearest— assigns the closest input pixel’s value without any arithmetic. No fractional outputs are ever produced. Mandatory for the Sentinel-2 Scene Classification Layer (SCL), cloud probability masks, and any integer-coded label layer. Applying bilinear or cubic to these layers generates in-between floats (e.g. 3.7 between cloud class 3 and shadow class 4) that breaknp.unique()checks andastype(np.uint8)conversions. -
Resampling.bilinear— weighted average of the four nearest pixels. Fast, memory-efficient, and the practical default for upscaling 20 m reflectance bands to 10 m. Introduces slight smoothing that suppresses high-frequency sensor noise but can soften abrupt field boundaries. Almost always preferable to cubic when working with cloud and shadow masking pipelines where edge integrity matters only after masking. -
Resampling.cubic— cubic convolution across a 4 × 4 pixel neighbourhood. Sharper edges than bilinear; best for visual composites and precision index derivation. Prone to overshoot (ringing) near high-contrast transitions such as water-land or urban-rural edges. Clamp outputs to[0, 10000]before writing to prevent negative reflectance or values beyond the sensor’s DN range. -
Resampling.average— arithmetic mean of all source pixels that fall inside the target cell. Required when downscaling 10 m data to 25 m, 100 m, or 500 m. Preserves energy conservation. Use for feeding regional models that ingest coarser spatial data. -
Resampling.mode— most frequent integer value in the target cell. The only correct choice for categorical downscaling; prevents artificial class generation thataveragewould produce.
Complete working example
The function below aligns an entire Sentinel-2 L2A scene to a uniform 10 m grid, applying the correct algorithm per layer type. It writes two outputs: a reflectance GeoTIFF stack and a co-registered SCL mask.
from __future__ import annotations
import numpy as np
import rasterio
from rasterio.enums import Resampling
from rasterio.transform import Affine
# Band paths: keys are band names, values are (file_path, is_categorical)
BAND_REGISTRY: dict[str, tuple[str, bool]] = {
"B02": ("S2_B02_10m.tif", False), # Blue reflectance — native 10 m
"B03": ("S2_B03_10m.tif", False), # Green reflectance — native 10 m
"B04": ("S2_B04_10m.tif", False), # Red reflectance — native 10 m
"B08": ("S2_B08_10m.tif", False), # NIR reflectance — native 10 m
"B05": ("S2_B05_20m.tif", False), # Red-edge — must upsample 20 m → 10 m
"B11": ("S2_B11_20m.tif", False), # SWIR — must upsample 20 m → 10 m
"SCL": ("S2_SCL_20m.tif", True), # Scene Classification — categorical
}
TARGET_RESOLUTION_M = 10 # target grid in metres
def resample_band(
src_path: str,
target_height: int,
target_width: int,
is_categorical: bool,
) -> tuple[np.ndarray, Affine, dict]:
"""
Read one Sentinel-2 band and resample it to (target_height, target_width).
Returns the array, the rescaled affine transform, and the rasterio metadata dict.
"""
method = Resampling.nearest if is_categorical else Resampling.bilinear
with rasterio.open(src_path) as src:
# Compute how much the native pixel size differs from the target
native_pixel_m = src.res[0] # e.g. 20.0 for a 20 m band
scale = native_pixel_m / TARGET_RESOLUTION_M # 2.0 when upscaling 20→10
# Read with out_shape triggers GDAL's warp engine inline — no temp file needed
data = src.read(
1,
out_shape=(target_height, target_width),
resampling=method,
)
# Rescale the affine transform so the written GeoTIFF has correct georef
new_transform: Affine = src.transform * src.transform.scale(
src.width / target_width, # x scale factor
src.height / target_height, # y scale factor
)
meta = src.meta.copy()
meta.update(
{
"height": target_height,
"width": target_width,
"transform": new_transform,
"dtype": "int16" if not is_categorical else "uint8",
}
)
return data, new_transform, meta
def build_aligned_stack(
output_reflectance: str,
output_scl: str,
ref_band: str = "B02",
) -> None:
"""
Align all bands to the 10 m grid defined by ref_band, write a reflectance
stack and a co-registered SCL mask.
"""
# Derive target dimensions from the reference (native 10 m) band
with rasterio.open(BAND_REGISTRY[ref_band][0]) as ref:
target_h, target_w = ref.height, ref.width
base_meta = ref.meta.copy()
reflectance_bands = {
name: resample_band(path, target_h, target_w, categorical)
for name, (path, categorical) in BAND_REGISTRY.items()
if not categorical
}
# Write multi-band reflectance GeoTIFF (band order: B02, B03, B04, B08, B05, B11)
band_order = ["B02", "B03", "B04", "B08", "B05", "B11"]
stack = np.stack(
[reflectance_bands[b][0] for b in band_order],
axis=0,
)
out_meta = base_meta.copy()
out_meta.update({"count": len(band_order), "dtype": "int16"})
with rasterio.open(output_reflectance, "w", **out_meta) as dst:
dst.write(stack)
# Write SCL mask
scl_data, _, scl_meta = resample_band(
BAND_REGISTRY["SCL"][0], target_h, target_w, is_categorical=True
)
with rasterio.open(output_scl, "w", **scl_meta) as dst:
dst.write(scl_data, 1)
print(f"Wrote {output_reflectance} shape={stack.shape}")
print(f"Wrote {output_scl} unique SCL classes: {np.unique(scl_data)}")
if __name__ == "__main__":
build_aligned_stack(
output_reflectance="S2_L2A_10m_stack.tif",
output_scl="S2_SCL_10m.tif",
)
Variant patterns
Reprojection + resampling in a single call
When moving from the native Sentinel-2 UTM zone to an equal-area projection such as EPSG:3035 (Europe) or EPSG:6933 (global), chain both operations inside a single rasterio.warp.reproject() call. Splitting them — reproject then resample separately — applies two successive interpolation passes that compound spectral error and inflate RMSE during ground-truth validation. The one-pass approach, covered in the mastering CRS transformations workflow, prevents this.
import numpy as np
import rasterio
from rasterio.crs import CRS
from rasterio.enums import Resampling
from rasterio.warp import calculate_default_transform, reproject
TARGET_CRS = CRS.from_epsg(3035) # ETRS89 / LAEA Europe — equal-area
def reproject_and_resample(
src_path: str,
dst_path: str,
target_res_m: float = 10.0,
is_categorical: bool = False,
) -> None:
"""
Reproject a Sentinel-2 band to TARGET_CRS while resampling to target_res_m
in a single GDAL warp call — avoids double-interpolation.
"""
method = Resampling.nearest if is_categorical else Resampling.bilinear
with rasterio.open(src_path) as src:
transform, width, height = calculate_default_transform(
src.crs,
TARGET_CRS,
src.width,
src.height,
*src.bounds,
resolution=target_res_m, # request explicit pixel size in the target CRS
)
dst_meta = src.meta.copy()
dst_meta.update(
{
"crs": TARGET_CRS,
"transform": transform,
"width": width,
"height": height,
}
)
with rasterio.open(dst_path, "w", **dst_meta) as dst:
for band_idx in src.indexes:
reproject(
source=rasterio.band(src, band_idx),
destination=rasterio.band(dst, band_idx),
src_transform=src.transform,
src_crs=src.crs,
dst_transform=transform,
dst_crs=TARGET_CRS,
resampling=method,
)
Dask-enabled resampling for large scenes or time series
For a full Sentinel-2 tile (110 × 110 km at 10 m = ~100 million pixels per band) or a time-series stack, load via xarray and rioxarray with Dask chunking. This keeps memory bounded and allows parallel computation before writing to a Cloud-Optimized GeoTIFF.
import xarray as xr
import rioxarray # noqa: F401 — registers .rio accessor on xr.DataArray
# Open the 20 m B05 band as a lazy Dask-backed DataArray
b05 = xr.open_dataarray(
"S2_B05_20m.tif",
engine="rasterio",
chunks={"x": 2048, "y": 2048}, # ~16 MB per chunk at int16
)
# Upsample to 10 m — rio.reproject_match aligns to a reference DataArray
b02 = xr.open_dataarray("S2_B02_10m.tif", engine="rasterio")
# reproject_match uses bilinear by default; pass Resampling.nearest for masks
b05_10m = b05.rio.reproject_match(b02)
# Compute and write; Dask executes in parallel across chunks
b05_10m.rio.to_raster("S2_B05_10m_dask.tif", driver="GTiff")
Aggregating to coarser grids for regional modeling
When feeding a spectral index calculation pipeline at coarser scales (e.g. integrating with MODIS or ERA5 data), simple pixel centre sampling discards sub-pixel information. Use Resampling.average to preserve energy conservation across the aggregated cell:
import rasterio
from rasterio.enums import Resampling
SCALE_FACTOR = 25 # 10 m → 250 m
with rasterio.open("S2_NDVI_10m.tif") as src:
out_h = src.height // SCALE_FACTOR
out_w = src.width // SCALE_FACTOR
ndvi_250m = src.read(
1,
out_shape=(out_h, out_w),
resampling=Resampling.average, # area-weighted mean over 25×25 cells
)
new_transform = src.transform * src.transform.scale(
src.width / out_w,
src.height / out_h,
)
meta = src.meta.copy()
meta.update({"height": out_h, "width": out_w, "transform": new_transform})
with rasterio.open("S2_NDVI_250m.tif", "w", **meta) as dst:
dst.write(ndvi_250m, 1)
Common errors
ValueError: cannot convert float NaN to integer
Caused by applying Resampling.bilinear to the SCL layer, which produces fractional values, followed by astype(np.uint8). Fix: always pass Resampling.nearest for any layer opened from an SCL or QA file.
Negative reflectance values after cubic resampling
Cubic convolution overshoots near high-contrast edges (water-land, urban-rural). Clamp immediately after reading: data = np.clip(data, 0, 10000). Consider switching to bilinear for coastal or mixed study areas where edge artifacts compound.
rasterio.errors.CRSError: Input shapes do not share CRS
Occurs when stacking a reprojected band (e.g. EPSG:3035) with a band still in its native UTM zone. Reproject all layers to the common CRS before resampling. Keep a single canonical CRS defined at pipeline entry and propagate it through every step using rasterio.warp.reproject().
Related
- Advanced Resampling & Upscaling Techniques — parent workflow covering the full range of interpolation strategies, super-resolution, and multi-sensor fusion.
- Cloud & Shadow Masking Strategies — apply SCL-based masks after resampling to avoid edge-mismatch artifacts.
- Spectral Index Calculation Pipelines — build NDVI, EVI, and SWIR indices on the aligned band stack produced here.
- Mastering CRS Transformations in rasterio — one-pass reproject-and-resample patterns that avoid double-interpolation error.
- Handling Pixel Resolution & Scaling — affine transform mechanics and scaling factors that underpin every resampling call.