Choosing the right resampling method for Sentinel-2

Use Resampling.nearest for Sentinel-2 QA masks and the Scene Classification Layer, and Resampling.bilinear (or Resampling.cubic) for continuous reflectance bands. The minimal snippet:

import rasterio
from rasterio.enums import Resampling

# Reflectance band: 20 m → 10 m
with rasterio.open("S2_B05_20m.tif") as src:
    data = src.read(
        1,
        out_shape=(src.height * 2, src.width * 2),
        resampling=Resampling.bilinear,
    )

# QA / SCL mask: nearest neighbor preserves integer class codes
with rasterio.open("S2_SCL_20m.tif") as src:
    scl = src.read(
        1,
        out_shape=(src.height * 2, src.width * 2),
        resampling=Resampling.nearest,
    )

Why this choice arises in Sentinel-2 workflows

Sentinel-2’s MultiSpectral Instrument delivers 13 bands at three native resolutions: 10 m (B2, B3, B4, B8), 20 m (B5–B8A, B11, B12), and 60 m (B1, B9, B10). Almost every pixel-wise operation — index calculation, classification, or time-series stacking — requires aligning these to a single common grid. The resampling algorithm you choose controls radiometric fidelity, mask integrity, and whether downstream models see physically meaningful values.

This page is part of the Advanced Resampling & Upscaling Techniques workflow, which sits inside the broader Satellite Processing Workflows & Index Pipelines.


Sentinel-2 Resampling Method Decision Guide Flowchart showing how to choose a resampling algorithm based on whether the Sentinel-2 layer contains continuous reflectance, categorical mask data, or is being aggregated to a coarser grid. Sentinel-2 layer Data type? (reflectance vs. class) categorical continuous aggregating ↓ Nearest Neighbor SCL, cloud masks, land cover classes Bilinear / Cubic reflectance, NDVI, spectral indices Average / Mode downscale to 100 m+ for regional modeling Rule: reproject first → resample to target grid → mask → compute indices Never resample after index calculation; never double-interpolate by reprojecting post-resample

Environment & setup

Package Minimum version Role
rasterio 1.3 Band I/O, Resampling enum, in-memory warp
numpy 1.24 Array operations, histogram comparison
scipy 1.11 KS-test for histogram validation (optional)

Install with:

pip install "rasterio>=1.3" "numpy>=1.24" "scipy>=1.11"

Algorithm reference

Before writing code it helps to understand what each kernel actually computes, because the error modes differ:

  • Resampling.nearest — assigns the closest input pixel’s value without any arithmetic. No fractional outputs are ever produced. Mandatory for the Sentinel-2 Scene Classification Layer (SCL), cloud probability masks, and any integer-coded label layer. Applying bilinear or cubic to these layers generates in-between floats (e.g. 3.7 between cloud class 3 and shadow class 4) that break np.unique() checks and astype(np.uint8) conversions.

  • Resampling.bilinear — weighted average of the four nearest pixels. Fast, memory-efficient, and the practical default for upscaling 20 m reflectance bands to 10 m. Introduces slight smoothing that suppresses high-frequency sensor noise but can soften abrupt field boundaries. Almost always preferable to cubic when working with cloud and shadow masking pipelines where edge integrity matters only after masking.

  • Resampling.cubic — cubic convolution across a 4 × 4 pixel neighbourhood. Sharper edges than bilinear; best for visual composites and precision index derivation. Prone to overshoot (ringing) near high-contrast transitions such as water-land or urban-rural edges. Clamp outputs to [0, 10000] before writing to prevent negative reflectance or values beyond the sensor’s DN range.

  • Resampling.average — arithmetic mean of all source pixels that fall inside the target cell. Required when downscaling 10 m data to 25 m, 100 m, or 500 m. Preserves energy conservation. Use for feeding regional models that ingest coarser spatial data.

  • Resampling.mode — most frequent integer value in the target cell. The only correct choice for categorical downscaling; prevents artificial class generation that average would produce.

Complete working example

The function below aligns an entire Sentinel-2 L2A scene to a uniform 10 m grid, applying the correct algorithm per layer type. It writes two outputs: a reflectance GeoTIFF stack and a co-registered SCL mask.

from __future__ import annotations

import numpy as np
import rasterio
from rasterio.enums import Resampling
from rasterio.transform import Affine

# Band paths: keys are band names, values are (file_path, is_categorical)
BAND_REGISTRY: dict[str, tuple[str, bool]] = {
    "B02": ("S2_B02_10m.tif", False),  # Blue reflectance — native 10 m
    "B03": ("S2_B03_10m.tif", False),  # Green reflectance — native 10 m
    "B04": ("S2_B04_10m.tif", False),  # Red reflectance — native 10 m
    "B08": ("S2_B08_10m.tif", False),  # NIR reflectance — native 10 m
    "B05": ("S2_B05_20m.tif", False),  # Red-edge — must upsample 20 m → 10 m
    "B11": ("S2_B11_20m.tif", False),  # SWIR — must upsample 20 m → 10 m
    "SCL": ("S2_SCL_20m.tif", True),   # Scene Classification — categorical
}

TARGET_RESOLUTION_M = 10  # target grid in metres


def resample_band(
    src_path: str,
    target_height: int,
    target_width: int,
    is_categorical: bool,
) -> tuple[np.ndarray, Affine, dict]:
    """
    Read one Sentinel-2 band and resample it to (target_height, target_width).

    Returns the array, the rescaled affine transform, and the rasterio metadata dict.
    """
    method = Resampling.nearest if is_categorical else Resampling.bilinear

    with rasterio.open(src_path) as src:
        # Compute how much the native pixel size differs from the target
        native_pixel_m = src.res[0]                # e.g. 20.0 for a 20 m band
        scale = native_pixel_m / TARGET_RESOLUTION_M  # 2.0 when upscaling 20→10

        # Read with out_shape triggers GDAL's warp engine inline — no temp file needed
        data = src.read(
            1,
            out_shape=(target_height, target_width),
            resampling=method,
        )

        # Rescale the affine transform so the written GeoTIFF has correct georef
        new_transform: Affine = src.transform * src.transform.scale(
            src.width / target_width,   # x scale factor
            src.height / target_height, # y scale factor
        )

        meta = src.meta.copy()
        meta.update(
            {
                "height": target_height,
                "width": target_width,
                "transform": new_transform,
                "dtype": "int16" if not is_categorical else "uint8",
            }
        )

    return data, new_transform, meta


def build_aligned_stack(
    output_reflectance: str,
    output_scl: str,
    ref_band: str = "B02",
) -> None:
    """
    Align all bands to the 10 m grid defined by ref_band, write a reflectance
    stack and a co-registered SCL mask.
    """
    # Derive target dimensions from the reference (native 10 m) band
    with rasterio.open(BAND_REGISTRY[ref_band][0]) as ref:
        target_h, target_w = ref.height, ref.width
        base_meta = ref.meta.copy()

    reflectance_bands = {
        name: resample_band(path, target_h, target_w, categorical)
        for name, (path, categorical) in BAND_REGISTRY.items()
        if not categorical
    }

    # Write multi-band reflectance GeoTIFF (band order: B02, B03, B04, B08, B05, B11)
    band_order = ["B02", "B03", "B04", "B08", "B05", "B11"]
    stack = np.stack(
        [reflectance_bands[b][0] for b in band_order],
        axis=0,
    )

    out_meta = base_meta.copy()
    out_meta.update({"count": len(band_order), "dtype": "int16"})

    with rasterio.open(output_reflectance, "w", **out_meta) as dst:
        dst.write(stack)

    # Write SCL mask
    scl_data, _, scl_meta = resample_band(
        BAND_REGISTRY["SCL"][0], target_h, target_w, is_categorical=True
    )
    with rasterio.open(output_scl, "w", **scl_meta) as dst:
        dst.write(scl_data, 1)

    print(f"Wrote {output_reflectance}  shape={stack.shape}")
    print(f"Wrote {output_scl}  unique SCL classes: {np.unique(scl_data)}")


if __name__ == "__main__":
    build_aligned_stack(
        output_reflectance="S2_L2A_10m_stack.tif",
        output_scl="S2_SCL_10m.tif",
    )

Variant patterns

Reprojection + resampling in a single call

When moving from the native Sentinel-2 UTM zone to an equal-area projection such as EPSG:3035 (Europe) or EPSG:6933 (global), chain both operations inside a single rasterio.warp.reproject() call. Splitting them — reproject then resample separately — applies two successive interpolation passes that compound spectral error and inflate RMSE during ground-truth validation. The one-pass approach, covered in the mastering CRS transformations workflow, prevents this.

import numpy as np
import rasterio
from rasterio.crs import CRS
from rasterio.enums import Resampling
from rasterio.warp import calculate_default_transform, reproject

TARGET_CRS = CRS.from_epsg(3035)  # ETRS89 / LAEA Europe — equal-area


def reproject_and_resample(
    src_path: str,
    dst_path: str,
    target_res_m: float = 10.0,
    is_categorical: bool = False,
) -> None:
    """
    Reproject a Sentinel-2 band to TARGET_CRS while resampling to target_res_m
    in a single GDAL warp call — avoids double-interpolation.
    """
    method = Resampling.nearest if is_categorical else Resampling.bilinear

    with rasterio.open(src_path) as src:
        transform, width, height = calculate_default_transform(
            src.crs,
            TARGET_CRS,
            src.width,
            src.height,
            *src.bounds,
            resolution=target_res_m,  # request explicit pixel size in the target CRS
        )

        dst_meta = src.meta.copy()
        dst_meta.update(
            {
                "crs": TARGET_CRS,
                "transform": transform,
                "width": width,
                "height": height,
            }
        )

        with rasterio.open(dst_path, "w", **dst_meta) as dst:
            for band_idx in src.indexes:
                reproject(
                    source=rasterio.band(src, band_idx),
                    destination=rasterio.band(dst, band_idx),
                    src_transform=src.transform,
                    src_crs=src.crs,
                    dst_transform=transform,
                    dst_crs=TARGET_CRS,
                    resampling=method,
                )

Dask-enabled resampling for large scenes or time series

For a full Sentinel-2 tile (110 × 110 km at 10 m = ~100 million pixels per band) or a time-series stack, load via xarray and rioxarray with Dask chunking. This keeps memory bounded and allows parallel computation before writing to a Cloud-Optimized GeoTIFF.

import xarray as xr
import rioxarray  # noqa: F401  — registers .rio accessor on xr.DataArray

# Open the 20 m B05 band as a lazy Dask-backed DataArray
b05 = xr.open_dataarray(
    "S2_B05_20m.tif",
    engine="rasterio",
    chunks={"x": 2048, "y": 2048},  # ~16 MB per chunk at int16
)

# Upsample to 10 m — rio.reproject_match aligns to a reference DataArray
b02 = xr.open_dataarray("S2_B02_10m.tif", engine="rasterio")

# reproject_match uses bilinear by default; pass Resampling.nearest for masks
b05_10m = b05.rio.reproject_match(b02)

# Compute and write; Dask executes in parallel across chunks
b05_10m.rio.to_raster("S2_B05_10m_dask.tif", driver="GTiff")

Aggregating to coarser grids for regional modeling

When feeding a spectral index calculation pipeline at coarser scales (e.g. integrating with MODIS or ERA5 data), simple pixel centre sampling discards sub-pixel information. Use Resampling.average to preserve energy conservation across the aggregated cell:

import rasterio
from rasterio.enums import Resampling

SCALE_FACTOR = 25  # 10 m → 250 m

with rasterio.open("S2_NDVI_10m.tif") as src:
    out_h = src.height // SCALE_FACTOR
    out_w = src.width // SCALE_FACTOR

    ndvi_250m = src.read(
        1,
        out_shape=(out_h, out_w),
        resampling=Resampling.average,  # area-weighted mean over 25×25 cells
    )

    new_transform = src.transform * src.transform.scale(
        src.width / out_w,
        src.height / out_h,
    )

    meta = src.meta.copy()
    meta.update({"height": out_h, "width": out_w, "transform": new_transform})

    with rasterio.open("S2_NDVI_250m.tif", "w", **meta) as dst:
        dst.write(ndvi_250m, 1)

Common errors

ValueError: cannot convert float NaN to integer Caused by applying Resampling.bilinear to the SCL layer, which produces fractional values, followed by astype(np.uint8). Fix: always pass Resampling.nearest for any layer opened from an SCL or QA file.

Negative reflectance values after cubic resampling Cubic convolution overshoots near high-contrast edges (water-land, urban-rural). Clamp immediately after reading: data = np.clip(data, 0, 10000). Consider switching to bilinear for coastal or mixed study areas where edge artifacts compound.

rasterio.errors.CRSError: Input shapes do not share CRS Occurs when stacking a reprojected band (e.g. EPSG:3035) with a band still in its native UTM zone. Reproject all layers to the common CRS before resampling. Keep a single canonical CRS defined at pipeline entry and propagate it through every step using rasterio.warp.reproject().